ABSTRACT


This thesis describes the design of a built-in self-test capability
for a military airborne digital computer. The supporting investigation
of program constraints and their effects on the example test design is
intended to give broad perspective to the general self-test design
problem. Alternate procedures for achieving the goal of airborne
detection and isolation of a certain class of failures to the modular
level are surveyed. A specific test design is evolved illustrating the
unique mix of program-oriented, periodic techniques and added-hardware,
continuous techniques best suited to the example development program.


The test design is evaluated and further work is suggested.
I. INTRODUCTION


Maintenance and repair of faulty electronic equipment have always
been the less glamorous companions of design and operation. Indeed,
the subjects were often broached only after design concepts were
formed and specific circuitry developed. The evolution of increasingly
complex electronic systems, such as digital computers, has forced 
greater and earlier consideration of the problems of locating failures 
and correcting them. A digital computer which automatically tests 
itself for proper operation and which provides valuable information to 
facilitate maintenance and repair has become very attractive for mili- 
tary and space systems applications. This thesis reports the results 
of an investigation to provide such automatic self-checking for a 
digital computer system. 

A project which considerably supported the investigation was accom- 
plished at Hughes Aircraft Company, Culver City, California, during an 
industrial experience tour. The project goal of designing a built-in 
self-test (BIT) capability for an advanced airborne digital computer 
system for military application was more fully realized because BIT 
was accepted as a principal design consideration early in the
architectural design procedure. The specific design developed will be used
as an example; however, the test procedures will be recognized as 
being more generally applicable to the class of digital computers for 
which the assumptions and constraints applied herein can be validated. 
Only one of many possible solutions to the fault detection and isolation 
problem will be presented. The choice made should not be construed to
reflect official policy at Hughes Aircraft Company.


Some general comments at the outset should place this investigation
in proper perspective and temper expectation with pragmatism. The
investigation has as its central focus the specific BIT design developed;
however, it is intended to consider the broader systems design options
available, thereby showing the example design in better perspective.

As Sellers, Hsiao and Bearnson [Ref. 43] so aptly observe, one should 
initially set reasonable design objectives relative to the thoroughness 
of test, recognizing that exhaustive automatic test is an almost 
unattainable practical goal. As part of a computer development pro- 
gram, the BIT design is subject to the larger program objectives and 
constraints. The first part of this investigation will define the test
design problem in more specific terms. Subject to practical limitations, 
a reasonable set of test objectives will be developed. Once objectives 
have been focused, alternatives for implementation will be considered 
and a test concept evolved. Specific test procedures will be presented 
for automatically testing the digital computer. Finally, the results 
obtained will be critically evaluated in light of the design objectives,
and further related work will be suggested.


II. PROBLEM DEFINITION AND DESIGN OBJECTIVES 


A. BROAD GOALS 

Given the framework of a digital computer in a military avionics 
application, one can identify three broad goals for a self-test 
capability: 


1. To decrease the cost of ownership by reducing maintenance cost/time
and increasing system availability.


2. To indicate to the pilot in flight the level of system operational
capability available to him. 


3. To provide limited assistance through self-test in prototype
design and checkout. 


Any information relative to the existence and location of failure will
reduce the time spent (and hence cost) to repair the computer and
therefore increase the aircraft's availability for operational purposes.
Airborne indications of system degradation through failure allow the
pilot to make timely and informed choices of alternatives to optimize
the probability of successful mission completion. Lastly, self-test
during computer development assists the engineer to more quickly
identify and correct design and hardware faults. In short, BIT is designed
to provide a greater system effectiveness at a lower cost; that is, to
increase cost-effectiveness.


B. PROGRAM CONSTRAINTS 
1. Cost

In a very real sense, the dominating factor affecting BIT design
problem definition is cost. Cost has several facets. The cost of BIT
is considered to be part of the overall computer program price tag.
Required performance criteria for the completed computer system are
specified by the sponsoring government agency to the aerospace industry.
A participating company must strive to reduce its proposed system's
cost while meeting or exceeding specifications to remain competitive.
So within the overall program development and production cost, the
contribution of BIT must be justified and minimized. Since the broad 
goal of increased cost-effectiveness has been identified for BIT, 
justification includes critical assessment of the added cost to the 
computer program of providing a self-test capability to ensure that a 
compensatory benefit in reduced cost of ownership will be realized. 

Sources of added cost for BIT include but are not limited to 
the following:


1. The checking hardware itself 
2. Additional power required 


3. Greater capacity logic to provide for the added checking 
hardware; e.g., drivers with greater fanout


4. Additional data lines to provide for test hardware and 
procedures 


5. Storage capacity required for BIT routines and data 
6. Design, programming, and development costs 


Other "costs", often translated into dollar values, include the
penalties (if any) attached to increased size and weight of an airborne
computer provided with BIT capability. For an air superiority fighter
application, these penalties are severe.¹


¹Hughes Aircraft Co. uses internally generated weighting factors
of $500/lb and $5000/ft³ for added hardware. To illustrate using
these typical penalties, two computers are compared:

1) A 0.5 ft³, 25 lb computer costing $50k
2) A 0.4 ft³, 20 lb computer costing $52.5k

The penalties added to computer (1) are:

0.1 ft³ x $5000/ft³ = $500 for volume
5 lb x $500/lb = $2500 for weight
Total = $3000 penalty

Computer (2), though ostensibly costing more, is $500 less expensive
after penalties are applied.
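
The penalty arithmetic of this footnote can be reproduced with a short
calculation. The following is only an illustrative sketch: the function
and variable names are hypothetical, and the choice of the smaller,
lighter computer (2) as the baseline against which excess volume and
weight are measured is an assumption drawn from the worked figures.

```python
# Weighting factors quoted in the footnote (dollars per unit of added hardware).
PENALTY_PER_LB = 500.0
PENALTY_PER_FT3 = 5000.0


def size_weight_penalty(volume_ft3, weight_lb, base_volume_ft3, base_weight_lb):
    """Dollar penalty for volume and weight in excess of a baseline design."""
    extra_volume = max(0.0, volume_ft3 - base_volume_ft3)
    extra_weight = max(0.0, weight_lb - base_weight_lb)
    return extra_volume * PENALTY_PER_FT3 + extra_weight * PENALTY_PER_LB


def effective_cost(price, volume_ft3, weight_lb, base_volume_ft3, base_weight_lb):
    """Purchase price plus the size and weight penalty."""
    return price + size_weight_penalty(
        volume_ft3, weight_lb, base_volume_ft3, base_weight_lb
    )


# Assumption: computer (2) is the baseline, since it is smaller and lighter.
BASE_VOLUME_FT3, BASE_WEIGHT_LB = 0.4, 20.0

cost_1 = effective_cost(50_000.0, 0.5, 25.0, BASE_VOLUME_FT3, BASE_WEIGHT_LB)
cost_2 = effective_cost(52_500.0, 0.4, 20.0, BASE_VOLUME_FT3, BASE_WEIGHT_LB)

print(f"computer (1) with penalties: ${cost_1:,.0f}")
print(f"computer (2) with penalties: ${cost_2:,.0f}")
print(f"advantage of computer (2):   ${cost_1 - cost_2:,.0f}")
```

Under these assumptions the comparison turns on the penalties alone;
a different baseline or different weighting factors would shift the
result accordingly.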
The benefits of BIT can also be reduced to monetary terms by
operations analysis techniques. Projected maintenance experience, spare
parts costs, inventory levels, and the effects of maintenance concepts
can all be given dollar values. However, the relative weight that
increased system operational availability receives is more subjective.
In a space system, for example, there is a very high premium on
availability; in a military airborne system, availability is important but
not as critical.

The result on the overall cost of ownership for the military 
system is that, while the penalties for providing BIT are quite clear, 
the benefits are harder to evaluate and therefore less visible. Even 
when a clear long-term reduction in cost of ownership can be expected,
insufficient available funding may force procurement of a less
expensive option without a BIT capability. The effect on BIT design is to
place emphasis on minimizing the more visible penalties, reducing them
to an acceptable fixed percentage of the system cost without a BIT
capability.²

2. The Parent Computer 

The nature of the computer for which the self-test capability 
is to be provided certainly has a large influence on the BIT design
objectives. For the example design, the characteristics for the 
parent computer evolved from the original specifications and the
subsequent company policy decisions. The parent computer was to:


²Estimates in the literature range from 3% cost increase for BIT
for a commercial machine to over 300% for a triplicated space
system computer. A figure of 10% fell in the general area of
acceptability at Hughes Aircraft Company for this project.


Have a military avionics application 

Be modular 

Have flexible word length 

Be non-redundant 

Be repaired on the ground, not in the air 

Have minimal storage capacity 

Suffer no operational degradation because of BIT 
Be developed on short schedule at low risk

Each of these characteristics will be more thoroughly discussed. 

A military avionics application implies that size and weight
are to be minimized consistent with the cost penalties discussed earlier. 
It also implies high speed, real-time computation. The more rigid 
military specifications concerning operating temperatures, humidity, 
shock resistance and other severe environmental factors affect the 
quality of components used and the packaging of these components at 
all levels. 

The computer was to be of modular construction, the term module
referring to a standardized plug-in circuit card with a given surface 
area and number of pin connectors. The Naval Avionics Facility 
Indianapolis (NAFI) has developed a series of modules designed to be 
acceptable as the basic building blocks for many military applications 
[Ref. 10]. The basic "NAFI module" chosen for use in the parent
computer (with some modifications) was the "2A" size whose important
features relative to BIT design are dimensions of roughly five (5) inches
in length and two (2) inches in height (both sides may be used for
mounting hardware) and 80 pins in the two bottom connectors. Figure 1, 
derived from Ref. 10, depicts the 2A NAFI module. The module's surface
area and number of pins place limitations on (1) the amount of hardware
which will physically fit on the module (heat dissipation is a related
problem), and (2) the number of external, intermodular electrical
paths available. The level of solid state technology of the implementing
circuitry determines whether the area or pin limitation dominates. For 
example, circuitry consisting of discrete components (separate trans- 
istors, capacitors, resistors) tends to impose an area limitation because 
the relatively large size of individual components limits the number 
which can be accommodated in the fixed area, before the available pin 
connectors are exhausted. At the other extreme, circuitry implemented 
using large scale integration (LSI) technology, in which perhaps 1000
or more gates are placed on a single silicon chip [Ref. 48], requires
little mounting surface area. The number of external connections
needed, however, can be large. Hence, in the latter case a pin
limitation exists. In between these extremes fall the integrated circuit
(IC) and medium scale integration (MSI) technological levels which may
be area or pin limited for specific modules. The size of the modular
partition chosen for the parent computer and the predominantly IC/MSI 
technology utilized will be seen to have a significant effect on BIT 
design. 

Partitioning of the parent computer was not otherwise speci- 
fied, except that the computer's basic design was to be readily 
adaptable for differing word length applications (specifically,
multiples of eight bits, up to a 32-bit word length) without major
redesign of the original modules. The expected initial application
of the parent computer specified a 24-bit word length; this word length
will be used in the example design.


The parent computer was to be essentially non-redundant; that 
is, no general replication of hardware at any level was intended. This 
constraint arose from cost considerations. Penalties in the additional 
hardware cost, increased size and weight associated with redundancy 
were deemed unacceptable. Additionally, the mean time between failures 
(MTBF) of the computer tends to be several times higher than the MTBF 
of the equipment which the computer serves; e.g., a radar.³

A closely related characteristic dictated ground repair of 
failures. No automatic reconfiguration under failure or fault-masking 
was intended, since such self-repair generally requires some redun- 
dancy. Airborne personnel to effect maintenance would not be available 
in the type aircraft for which application was projected. Access, 
removal of shielding, and dust-free repair would be difficult airborne. 
Built-in test was therefore restricted to detection and isolation of 
faults, and was not intended to include a self-repair capability. 

The requirement for minimal storage capacity was again related 
to cost. Random access storage such as core memory is expensive in 
hardware, size, weight, and power requirements. No peripheral bulk 
storage devices such as drum, disc, or tape were to be available. The
effect of these characteristics of the parent computer on the design
of BIT is significant. The dedication of memory bit locations to
storage of error detecting codes, such as parity or residue, is
eliminated from consideration because of the attendant reduction in word
length available to the flight program. Increased word length is
unacceptable because of the greater storage requirement and higher cost.
Coding is a widely used technique for detecting data transfer errors.
The storage of software self-test programs and data in core memory is
also virtually eliminated from the list of often-used test tools. The
core memory, then, is reserved for the flight program and for operational
use with negligible capacity available for BIT use.

³Reference 34 shows MTBF's in the 100's of hours for the F-111A
weapons system avionics equipments. MTBF's for airborne computers,
as shown by marketing brochures, are typically in the 1000's of hours.
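
To make concrete what is being excluded, the following sketch shows how
stored check symbols of the parity and residue kinds detect a corrupted
word. It is an illustration only and not part of the example design:
the function names are hypothetical, even parity and a modulo-3 residue
are assumed as representative codes, and the 24-bit word length is that
of the example design. Each stored word would carry one extra bit for
parity, or two extra bits for the residue, which is exactly the storage
the program constraints do not allow.

```python
WORD_BITS = 24                       # word length of the example design
WORD_MASK = (1 << WORD_BITS) - 1


def parity_bit(word):
    """Even-parity check bit: 1 when the word holds an odd number of ones."""
    return bin(word & WORD_MASK).count("1") % 2


def residue_mod3(word):
    """Two-bit residue check symbol: the word value modulo 3 (assumed modulus)."""
    return (word & WORD_MASK) % 3


def encode(word):
    """Return the word together with the check symbols stored alongside it."""
    word &= WORD_MASK
    return word, parity_bit(word), residue_mod3(word)


def parity_ok(word, stored_parity):
    return parity_bit(word) == stored_parity


def residue_ok(word, stored_residue):
    return residue_mod3(word) == stored_residue


word, p, r = encode(0x5A3C0F)
single_flip = word ^ (1 << 7)                # one bit corrupted
double_flip = word ^ (1 << 7) ^ (1 << 3)     # two bits corrupted

print("fault-free word passes:", parity_ok(word, p), residue_ok(word, r))
print("single-bit error caught by parity:", not parity_ok(single_flip, p))
print("single-bit error caught by residue:", not residue_ok(single_flip, r))
print("double-bit error missed by parity:", parity_ok(double_flip, p))
```

A single flipped bit changes the word value by a power of two, which is
never a multiple of three, so the modulo-3 residue catches every
single-bit error, as parity does. Parity misses any even number of
flipped bits.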

Any self-test capability is not allowed to degrade the real-time 
operational efficiency of the computer in speed or availability. The 
effect of this requirement is to prohibit the insertion of test hard- 
ware in operational propagation paths because of the delays thereby 
introduced. Additionally, any sequential, program-oriented test routines
would have to be exercised on a time-shared basis with ongoing tactical
operations in available short blocks of "idle" time. Such routines would
therefore have to be interruptible without destroying test efficacy
so that the machine could be returned to operational computation
immediately, whenever required.

The overall computer program called for a short development
schedule with low risk to the company. These constraints dictate the 
use of existing techniques and designs wherever feasible. No completely 
new technology could be developed within schedule requirements. Off- 
the-shelf hardware components would be primarily used because of the 
risks attendant in meeting a short schedule with components potentially
available from outside suppliers at production time but still under
development during computer design.
C. BIT DESIGN OBJECTIVES

With the aforementioned broad goals set forth and the constraints 
imposed on self-test design by the nature of the larger program more 
clearly defined, realistic BIT design objectives can be developed. The 
maintenance problem would be most significantly assisted if faults 
could be isolated to the plug-in card, or modular level. Sub-modular 
fault isolation, while desirable from the standpoint of higher echelon 
maintenance, does not contribute any more significantly to increased 
aircraft availability since the faulty module must be removed in either 
case. Conformal coating for environmental protection applied to
circuitry within the module makes removal of sub-modular components a
difficult and specialized task inappropriate at the immediate squadron
(1st echelon) level. Higher level isolation would require replacement
of large and more expensive units of the computer. Stocking of spare 
parts at the module level seems reasonable for the squadron shop both 
in the inventory costs involved and the volumes required. Of course, 
commonality among modules reduces the different types to be stocked and 
is desirable. These heuristic arguments can be quantified, but the views
presented should suffice to intuitively support the decision to set
fault isolation to the modular level as a BIT design objective.

Since no airborne repair, manual or automatic, is required, 
reporting of faults detected within specific modules completes the 
self-test task. A compatible design objective, supporting the second 
and third broad goals related to pilot notification of failure and aid
to prototype development, is to rapidly indicate the specific modular
location of existing faults to a central location for immediate use by
the pilot and later use by maintenance personnel upon mission
termination.

The BIT design objectives and major constraints can now be summa- 
rized. The BIT design should automatically detect failures in the 
computer and isolate them to the modular level airborne. The modular 
location of such failures should be rapidly reported to a central loca- 
tion. The design should be minimized as to cost, require negligible 
core memory storage, utilize no coding techniques requiring storage 
capacity, and inflict no operational degradation on the computer's speed
and availability. All this should be accomplished on short schedule 
and at low risk. While these objectives and constraints for a self- 
test design are imposing, they are not atypical of the requirements 
of a military airborne system. Just what constitutes the failure to
be detected can now be examined.
III. THE NATURE OF FAILURE


Since the objective of "fault detection" has been set, its meaning 
should be explained. This section will consider what constitutes a 
fault and will define several related terms. The literature is replete 
with descriptive terms such as catastrophic, intermittent, solid, 
transient, burst, marginal, multiple, incipient, minor, and gross, applied
to fault and the related terms failure, error and malfunction.⁴ The
terms "fault," "failure," and "malfunction" will be used synonymously
to mean a physical defect in equipment which causes that equipment to 
perform in an unsatisfactory manner. The substandard performance 
usually resulting from a fault will be termed an "error." Another way 
of stating this is to say that an error is an incorrect result. The 
terms "solid" and "intermittent" will be used to characterize the dura- 
tion of the error, and by inference, the failure causing the error. 

A solid error will refer to an error which results from a failure which 
persists; a solid error will consistently recur under the same equip- 
ment conditions. An intermittent error will be one which is of short or 
transient duration and is non-persistent; that is, an intermittent 
error does not consistently recur given the same conditions. The terms 
"catastrophic" and "transient" are often used to describe these two 
categories of error, but they will not generally be used herein. The
idea of degrees of failure is introduced by such terms as marginal,
single or multiple, minor or gross. The term "marginal" will be
reserved to describe a category of testing. The terms "single" and
"multiple" will refer to one failure or error, and to more than one
failure or error, respectively.

⁴A good discussion of some typical terminology surrounding "failure"
is found in Ref. 24.

Erroneous results can arise from sources other than equipment 
failure. Programming inaccuracies and human operator mistakes will 
not be considered to be errors within the scope of this investigation.
Equipment failure leading to erroneous results represents the class of 
faults to be detected by the design test techniques. Inaccurate intra- 
computer data transmission, faults in logic, failures in core memory, 
and failed test circuitry are representative of faults within this 
class of interest. 

Certain types of equipment, generally termed "hard-core," serve 
the entire computer and must operate properly if the computer is to 
function at all. Examples of such equipment are main power supplies,
clocking circuitry, cooling equipment and other mechanical components
such as electromagnetic interference shielding. Faults in this hard- 
core equipment have been effectively identified by voltage/temperature 
sensing devices which continuously compare performance to preset toler- 
ances, and similar well-known techniques [Ref. 46]. Faults in the types 
of hard-core equipment described above will not be considered to be part 
of the BIT detection and isolation task as defined herein. The main 
thrust of this investigation will treat the less adequately resolved
problems of identifying and locating all possible failures in the logic
circuitry, storage, data transmission paths, checking hardware and
other equipment which is not hard-core in the previous sense of
providing "housekeeping" and utility services.⁵

Faults are usually identified by detecting the resultant errors. 

If a fault does not produce erroneous results, its existence is of 
little immediate consequence. For example, a shorted transistor always 
causing an output to be at the low voltage level (the zero of positive
logic having the binary logical states one and zero) does not become
significant until the high voltage level represents the proper output
value. In other words, a stuck-at-zero failure is not important until
the proper result should be a logical one. Conversely, as previously
mentioned, not all errors are the result of equipment failure (e.g.,
operator mistakes), but some of these appear to be the result of
equipment failure. Equipment failure modes should be examined to identify
those of interest to the test design. 
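
The observation that a fault matters only when it produces an erroneous
result can be illustrated with a single gate. The sketch below is
hypothetical and is not taken from the example design: it models a
two-input AND gate in positive logic and lists the input combinations
for which a stuck output disagrees with the fault-free gate.

```python
from itertools import product


def and_gate(a, b):
    """Fault-free two-input AND gate in positive logic."""
    return a & b


def stuck_output(level):
    """Model a gate whose output is stuck at the given logic level."""
    def faulty_gate(a, b):
        return level
    return faulty_gate


def erroneous_inputs(faulty_gate, good_gate=and_gate):
    """Input combinations for which the fault produces an observable error."""
    return [
        (a, b)
        for a, b in product((0, 1), repeat=2)
        if faulty_gate(a, b) != good_gate(a, b)
    ]


stuck_at_zero_errors = erroneous_inputs(stuck_output(0))
stuck_at_one_errors = erroneous_inputs(stuck_output(1))

print("stuck-at-zero is visible for inputs:", stuck_at_zero_errors)
print("stuck-at-one is visible for inputs: ", stuck_at_one_errors)
```

For this gate the stuck-at-zero fault stays hidden for three of the four
input combinations and appears only when both inputs are one, which is
why a test must apply inputs that call for the opposite output value.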

Assuming transistor building blocks (discrete, IC, MSI, or LSI
technology) for the example computer logic (vice cryogenics or some
other technology), some of the possible failure modes are:

1. Inputs or outputs stuck at the high or low voltage levels
(stuck-at-one, stuck-at-zero). Inputs stuck above the high
level or below the low level, a possible condition in some
computers, have the same effect.

2. Inputs or outputs stuck at an indeterminate, intermediate
level between the high and low voltage levels. Indeterminate
voltage levels might sometimes be interpreted as a one, and
sometimes as a zero.

3. Deteriorated component response to inputs or weakened drive
capability of outputs.

⁵The term "hard-core" will later also be applied to some equipment
within this group subject to test, but in a different sense.


The first failure mode is the one of greatest interest for the 
subject test design because such persistent failures result in solid 
errors susceptible to detection and isolation. 

The second failure mode, inputs or outputs stuck at an indeterminate 
voltage level, might lead to no error if properly interpreted, inter- 
mittent error if interpreted differently at different times, or solid 
error if consistently misinterpreted. An assumption which is often
made in deriving a diagnostic scheme is to disallow the second failure
mode.⁶ Another way of stating this is to assume that logic fails to
one of the two logic levels, one or zero, and not to some intermediate
level. The assumption can be validated by setting a voltage threshold
above which results will be interpreted as one logical state, and below
which results will be interpreted as the other logical state. The
assumption of disallowing the second failure mode will be made for the
test design.⁷

The third failure mode could result in solid or intermittent
errors depending on the consistency of the erroneous results and the
duration. For example, a weak driving capacity of an output feeding
several subsequent inputs (fan out) could result in some inputs
receiving a zero and others a one. This would be a solid error if the same
inputs always received the same signal under the given conditions. An
intermittent error would result if, for example, a driven input received
a logical one in one instance and a zero in another for the same driving
output value. The third failure mode is considered part of the test
problem. It will be discussed again under the topic of marginal testing
in Section IV.

⁶For example, see Ref. 31.

⁷This assumption is occasionally not made. For example, one scheme
which relies on circuitry which fails to a NULL state intermediate
between one and zero is described by Connolly and Schmitt [Ref. 8]. The
assumption of failure to one or zero is far more common.

Intermittent errors should be discussed more fully, as they are 
sometimes part of the test problem and sometimes not. Some physical 
causes of intermittent errors are: 

1. Dirty connectors - a small smudge of oil or dirt on a pin might
be sufficient to intermittently block the low current levels
typically found in intermodular lines. Vibration can provide
slight shifts in the contact surfaces sufficient to make or
break contact.

2. Temporary overheating of hardware regions - when not persistent,
such transient environmental conditions can cause intermittent
erroneous results.

3. Loose connections or particles between circuits or within
hardware packages - vibration can cause open and closed circuit
conditions intermittently.

4. Unusual electromagnetic interference (EMI) or coupling - spikes
coupled into the circuitry from outside, or appearing through
the power supply, can cause changes in state resulting
in erroneous performance.

5. Drifting characteristics - aging or deteriorating components
or changing environmental conditions can cause varying and
inconsistent performance changes in circuitry.

While the above list is certainly not complete, it does serve to 
illustrate the many sources of intermittent error, and to suggest the 
difficulty of detecting and isolating the causes of such errors. Those 
causes not representing hardware failure, such as dirty connectors or 
unusual EMI, can cause erroneous results which falsely indict
fault-free circuitry (which, when faulty, exhibits the same symptoms). Such
causes of faulty performance are important because even one state
change affecting a logical decision within the machine can produce
catastrophic results. While intermittent errors caused by other than
hardware failure have been excluded from the test problem, test
procedures must endeavor to ensure they are, in fact, excluded. A
procedure which signals hardware failure when none exists not only reduces
the level of confidence accorded error signals, but also increases 
cost, in direct opposition to BIT objectives, by causing fault-free 
circuitry to be replaced. 

The degree, or extent, of failure is also important to test design. 
Single failures are inherently easier to detect and isolate than 
multiple failures; the detection problem is smaller. Additionally, 
multiple solid failures can have the property of occasionally masking 
each other, giving the appearance of intermittent single failure. To
reduce the test problem to reasonable limits, the assumption that there 
exists at most a single failure in a computer to be tested is often 
made. The validity of the "single failure assumption" will be examined 
relative to the example BIT design as a possible means of reducing the 
quantity of added hardware required to give sufficient test effectiveness 
within acceptable program bounds. 

The components used in modern military/space systems are designed 
to have high individual component reliability. Low power silicon 
transistors in the Raytheon equipment used in Apollo and Polaris pro- 
grams, for example, were found to have a failure rate of 1.4 x 107° 
failures/1000 hours [Ref. 40]. If multicomponent packages such as IC's
are used, the interconnections between components on the same silicon
chip are more reliable than in the discrete component case. Overall
equipment reliability can therefore be expected to go up through the
use of integrated circuits [Ref. 29]. Figures provided from a variety
of aerospace suppliers 1964 to 1966 show failure rates for integrated
[...] 1000 hours [Ref. 19]. Brauer has reported integrated circuit failure
rates varying from 7 x one failures/1000 hours to 6 x iKie failures/ 
1000 hours [Ref. 4]. Infant mortality failures and adolescent failures, 
usually occurring during burn-in and testing at the factory, exceed 
the exponential failures (constant failure rate) more common in an 
operationally deployed unit. This partially accounts for the diversity 
in the cited failure rates, and emphasizes the need to know failure 
rate sources and conditions for proper interpretation. The point to 
be made is that even the most pessimistic of the cited figures shows 
that a long operating life can be expected from modern components. 

The MTBF of a computer considers all the different component 
failure rates in addition to connection reliabilities and workmanship 
flaws in assigning a commonly used overall reliability figure of merit. 
The MTBF of the digital airborne computer can be expected to be in the
1000's of hours.⁸ With system MTBF's of this order of magnitude, the
probability of experiencing one failure in a short time interval is
very small. Experiencing two or more failures in the same short time
interval is highly improbable. It then seems reasonable that one incurs
a very small risk of undetected error if one designs test techniques
assuming single failure, as long as testing is done at least periodically
at short intervals. This intuitive approach is used, as more exact
calculations are dependent on actual failure rates, numbers and types of
components, specified confidence levels and assumed distributions. The
single failure assumption seems to be justified for the example design,
and will be made. Restated, the assumption asserts that the computer
is constructed of highly reliable individual components so that
essentially simultaneous failure of more than one component is so improbable
that it can be reasonably neglected. The assumption is further
justified economically by program limitations in that testing for
multiple failures requires more added hardware at an unacceptable cost
penalty.

⁸The Autonetics D26J airborne computer with an estimated MTBF of
18,000 hrs; the Litton LC-728, 4,250 hrs; the Raytheon R-11, 3,500 hrs;
the CDC 5400, 2,500 hrs are examples from marketing brochures.
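
The intuitive argument can be given a rough numerical feel. The sketch
below is not a calculation from the thesis; it assumes a constant
failure rate equal to 1/MTBF, so that the number of failures in an
interval follows a Poisson distribution, and it assumes a test interval
of one hour purely for illustration. The 2,500 hour MTBF is the lowest
of the marketing-brochure figures cited for airborne computers.

```python
import math


def failure_probabilities(mtbf_hours, interval_hours):
    """Return (P[exactly one failure], P[two or more failures]) in the interval.

    Assumes a constant failure rate of 1/MTBF (Poisson failure counts).
    """
    expected_failures = interval_hours / mtbf_hours
    p_at_least_one = -math.expm1(-expected_failures)   # 1 - exp(-x), accurately
    p_exactly_one = expected_failures * math.exp(-expected_failures)
    p_two_or_more = p_at_least_one - p_exactly_one
    return p_exactly_one, p_two_or_more


# Assumed values: 2,500 hour MTBF, testing repeated at least once per hour.
p_one, p_multi = failure_probabilities(mtbf_hours=2500.0, interval_hours=1.0)

print(f"P(exactly one failure)   = {p_one:.3e}")
print(f"P(two or more failures)  = {p_multi:.3e}")
print(f"ratio single to multiple = {p_one / p_multi:,.0f}")
```

Under these assumptions a multiple failure between successive tests is
several thousand times less likely than a single failure, which is
consistent with the qualitative conclusion above. Shorter test intervals
or longer MTBF's widen the margin further.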

The foregoing examination of the nature of failure has led to some 
assumptions and conclusions relative to BIT design. First of all, 
logic will be assumed to fail to one of its two logic states, and not 
to some intermediate level. Solid failures will be of major interest; 
however, any failure leading to erroneous results is part of the detec- 
tion and isolation problem. Intermittent errors will be especially 
difficult to detect and isolate. Those erroneous results caused by 
non-hardware sources are important in that care must be taken to avoid 
condemning fault-free hardware as their source. Finally, the single 
error assumption will be made because little risk of undetected error 
is thereby incurred, and it presents the most reasonable approach from 
an economic standpoint. Now the possible test procedures available
to meet the BIT objectives can be considered.

IV. TEST PROCEDURE ALTERNATIVES


A. GENERAL CONSIDERATIONS 

Comparison forms the basis of all test procedures. A norm against 
which comparison can be made must be available, either a priori or as 
a result of some generating process. The computer then produces a 
result which is suspect until verified against the norm. The variety 
of procedures available for testing a computer have this comparative 
process in common. 

Since thorough testing for all possible errors within the test area
of interest is the objective, the different levels at which testing can
be conducted should be identified. The computer can be functionally
exercised by directing it to perform the operations for which it was
designed on a variety of operands. The thoroughness of test can be
evaluated by asking how many of the possible machine states are thereby
verified. The totality of the possible combinations of inputs and
outputs of the machine's logic circuits forms the set of machine states.
A gross functional check performed by exercising the computer's
instruction set on a few operands can be seen to be less efficient and
complete in verifying proper operation of all circuitry than
comprehensive application of the set of inputs with comparison of
resulting outputs against the set of unfailed machine output states.
The one test method is superficial while the other is unnecessarily
exhaustive. Each has been termed "100% testing" by industrial
marketeers. The percentage of testing for this investigation will refer
to the percentage of possible errors for which checking has been
performed. The former method mentioned above would probably yield a low
percentage while the latter would represent testing in excess of 100%.
The closer to the logic level that testing is directed, the more
thorough testing becomes. Selective testing at the logic level can be
most efficient in identifying all the failures of interest.
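The definition of test percentage given above can be made concrete with
a small sketch. It assumes a gate-level full adder as a stand-in circuit
and takes "possible errors" to mean each node stuck at 0 or stuck at 1.
The circuit, the node names and the test vectors are all hypothetical
and are not part of the example design.

```python
from itertools import product

NODES = ["a", "b", "cin", "x1", "a1", "a2", "sum", "cout"]
FAULTS = [(node, stuck) for node in NODES for stuck in (0, 1)]

def full_adder(a, b, cin, fault=None):
    """Evaluate a gate-level full adder; fault is (node, stuck_value) or None."""
    v = {}
    def put(node, value):
        # A stuck-at fault overrides whatever the logic would have produced.
        v[node] = fault[1] if fault and fault[0] == node else value
    put("a", a)
    put("b", b)
    put("cin", cin)
    put("x1", v["a"] ^ v["b"])
    put("a1", v["a"] & v["b"])
    put("a2", v["x1"] & v["cin"])
    put("sum", v["x1"] ^ v["cin"])
    put("cout", v["a1"] | v["a2"])
    return v["sum"], v["cout"]

def coverage(tests):
    """Percentage of single stuck-at faults that change some test's output."""
    detected = sum(
        any(full_adder(*t, fault=f) != full_adder(*t) for t in tests)
        for f in FAULTS
    )
    return 100.0 * detected / len(FAULTS)

gross = [(0, 0, 0)]                              # one functional check
selected = [(0, 0, 0), (1, 1, 1), (1, 0, 1)]     # chosen at the logic level
exhaustive = list(product((0, 1), repeat=3))     # every input combination

for name, tests in [("gross", gross), ("selected", selected),
                    ("exhaustive", exhaustive)]:
    print(f"{name:10s} {len(tests)} tests  {coverage(tests):5.1f}%")
```

In this toy case three selected vectors reach the same coverage as all
eight input combinations, which illustrates why selective testing at the
logic level can be the most efficient choice.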

Not only must test procedures check for all possible failures of
interest, they must also take care to avoid signalling error when none
exists, as alluded to in Section III in the case of non-hardware-caused
intermittent error. Testing which is not thorough leads to invalidation
of the single failure assumption since some failures can go undetected.
On the other hand, inappropriate error signals "crying wolf" can cause
the pilot to take unnecessary abnormal action detrimental to mission
completion. A significant advantage to testing conducted in the
airborne environment is that not all errors identified airborne would be
found if ground testing procedures were used instead. Consequently,
ground maintenance personnel must have a high degree of confidence in
airborne error indications since ground verification may be impossible.
If a throwaway maintenance concept is in effect, good modules might be
discarded because of inaccurate test results.

Detection of error is only one part of the test problem. Isolation
of the causative failure is the other. Test procedures differ in their
ability to provide fault isolation. Early test procedures were designed
to produce isolation to the single component level (if isolation was
provided at all) since machines were constructed with discrete
technology. The multicomponent package of the higher level technologies
has made such fine resolution procedures unnecessary. For the example
BIT design, the modular level is the level of interest. One is not
concerned where within the module a failure is located; whether or not
the modular package as an entity is faulty is of primary interest. With
these general comments as a background, the various ways of categorizing
test procedures can be explored.


B. PROCEDURE CATEGORIES 
1. Normal vs. Marginal 

Diagnosis of existing solid errors should be the first order of 
business for any test procedure. Prediction of possible future failures 
would be a desirable supplement to the preceding tests to locate exist- 
ing errors. The former testing will be termed "normal" testing while 
the latter is called "marginal" testing. Normal testing will be the 
type pursued in the example test design. However, marginal testing 
conducted in conjunction with normal testing is generally valuable in 
furthering test objectives. 

Intermittent errors cause one of the biggest problems for the
test designer. However, an intermittent failure causing inconsistent
results can often be forced to become a solid failure with a resultant
solid error manifestation through marginal testing techniques
[Ref. 7]. Marginal testing tends to worsen the third failure mode
discussed in Section III by further weakening already deteriorated
components until they become solid failures of the more easily
diagnosed first failure mode. Marginal testing consists of overstressing
components through the application of abnormal conditions to cause the
weak ones to fail prematurely during test instead of later during
normal operations. Stressing, for example, can consist of over or
under biasing transistors by a certain percentage of rated values. The
danger of marginal testing is that existing intermittent failure can be
masked by a rash of new failures should stressing be done carelessly
or to needless extremes [Ref. 3]. When done carefully, however,
marginal testing, in effect, predicts future failures by forcing them to
occur at non-critical times. It also serves to identify and rid the
machine of bothersome intermittent failure, thus increasing the
degree of confidence accorded to airborne test results.

Marginal testing is generally not appropriate airborne because
of the time and extra equipment necessary to accomplish it. The
accomplishment of marginal testing on the ground depends upon the
maintenance concept. If periodic maintenance on the ground supplements
airborne built-in testing, marginal testing should be part of this
periodic procedure. In the example design, where no airborne repair is
done, marginal testing can be accomplished whenever the computer is
removed from the aircraft for repair of a solid failure identified by
BIT.

2. Software vs. Hardware 

Software testing refers to program-oriented, sequentially
executed, periodic testing. The computer is directed by a program to
accomplish a series of operations on supplied data. The results of
these operations are then interpreted to provide diagnostic
information. Since software testing is program-oriented, the level of
testing (and therefore, to a certain extent, the efficiency of testing)
is determined by the level of the programming language used. The lower
the order of the programming language, the closer to the component level
operations can be specified. Assembly language or its equivalent is
most frequently used.

A programmed test routine is sequentially executed, one instruc- 
tion after the next. The length of the program in number of instructions, 
the cycle time of the storage device containing the program, and the 
execution times of the instructions affect the time duration of the 
test. Test results can usually only be determined after a sequence of 
instructions has been executed and a result determined. This result 
is then compared against some previously calculated correct result to 
see if error has occurred. The same sequence of instructions might 
then be repeated with a different set of data and a different expected 
result. Comparison against the norm can take place automatically 
under program control after short sequences have been executed, or 
later upon examination of a printed output. 

Procedures for software testing differ widely. The detection
and isolation functions can be accomplished concurrently or separately.
In the separate case, an "executive" routine might be run periodically
to determine in a gross sense whether or not the computer is exhibiting
abnormal behavior. Once such behavior is sensed, a more detailed
"diagnostic" routine might be run to determine the more exact location
of the failure causing the error. Because of the limitations of the
programming language in closely manipulating suspicious components,
results might localize the failure only to a region of the machine.
Technicians would then locate the failure by hand probing. Such
procedures tend to be inefficient, marginally effective, and always
time-consuming.

The characteristics of software testing can be evaluated with
regard to the BIT objectives. A definite advantage is that software
testing requires little added hardware (other than storage) to
accomplish the checking function. Isolation after detection is difficult
because of the periodic nature of testing. The test program typically
occupies core memory (unless slower peripherals are available for
temporary storage) and requires significant running time if many
different test data are to be used in an attempt to make testing more
comprehensive. Some functional degradation would occur when time is
scarce, even when the test program is run on a periodic basis, because
testing must share available time with the operational flight program
execution. On the other hand, the shorter the test program and the
longer the interval between tests, the greater the danger of using
erroneous results of undetected failure and downgrading test efficacy by
invalidating the single failure assumption. Test results are only known
after several operations have been executed. This presupposes that the
machine has not failed to the extent that it cannot execute instructions
and give results necessary to locate the failure. Intermittent failure
would tend not to be detected by software testing, eliminating the
problem of signalling error and indicating failure when none exists. On
balance, software testing did not look generally attractive for the
example design.

Hardware testing refers to checking accomplished by added
circuitry. Such testing is characterized by simultaneous detection
and isolation usually at the logic level, rapidly available results,
and minimal degradation of operational capability. In general, the
added checking hardware generates a basis for comparison with
concurrently generated flight program results, and actually accomplishes
the comparison at the logic level. Operation at the logic level provides
excellent fault isolation capability. Results of the comparison are
known essentially immediately. If a fault exists, it can be located and
appropriate action taken prior to contamination of other data, or
utilization of erroneous results. Hardware testing differs from
software testing in that it checks the correct operation of the circuit
being tested, but does not verify the correctness of the data being
operated upon. The effect is that each circuit in a chain must be so
checked if resultant data is to be certified. Further discussion of
concurrent testing, characteristic of hardware testing, will be
presented in the next subsection.

By virtue of consisting of fewer components, checking circuitry 
is inherently more reliable as a whole than the logic it checks. How- 
ever, the components themselves are just as subject to failure as the 
components they test. To provide a high confidence of valid testing,
therefore, one must consider the added test hardware itself as a
potential source of failure. Such hardware then becomes hard-core in the
sense that its proper functioning must be verified before testing
commences. Unlike the hard-core housekeeping and service hardware 
previously mentioned, checking hardware was considered to be part 
of the test problem. 

Hardware testing offered many benefits making it attractive
as a means of meeting the example design objectives within program
constraints. Its obvious disadvantage relative to software test was
the much higher cost penalty incurred as a result of the expense of
added hardware. A combination of hardware test to provide efficient
test performance and software test to reduce expense offered a possible
tradeoff for the example design.


3. Continuous vs. Periodic 
Testing can be classified by its duration as either continuous
or periodic. Continuous testing must also be concurrent (the results
of test may be somewhat time-skewed) since ongoing operational
computations occur simultaneously. Continuous testing is characteristic
of hardware test. The effectively immediate failure detection provided
by continuous testing tends to identify intermittent errors, where
periodic testing does not. The single failure assumption is justified
since failures are detected as soon as they occur. Operations can be
halted upon occurrence of an error and the machine state at time of
halt preserved. The process of "retry" or "restart" then attempts the
last operation again to see if the same error recurs. Recurrence
indicates a solid error and failure is flagged. Non-recurrence denotes
an intermittent error, in which case the second correct attempt is used
and operation continued. By noting the recurrence rate of intermittent
error under the same conditions, intermittent hardware failure can often
be distinguished from one-shot external sources. Hard-core
housekeeping and service hardware is generally continuously tested.
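
The retry decision described above can be sketched in a few lines. The
function and helper names are hypothetical, and the `expected` argument
merely stands in for the verdict that added checking hardware would
supply; the sketch shows only the classification logic, not how the
machine state is preserved.

```python
def classify_error(operation, expected, max_retries=1):
    """Run an operation and, on error, retry it to classify the error.

    Returns "ok", "intermittent" (a retry succeeded, so its result can
    be used and operation continued) or "solid" (the error recurred,
    so failure is flagged).
    """
    if operation() == expected:
        return "ok"
    for _ in range(max_retries):
        if operation() == expected:
            return "intermittent"
    return "solid"

def make_glitchy(correct, wrong):
    """Build a simulated operation that is wrong on its first call only."""
    calls = {"n": 0}
    def op():
        calls["n"] += 1
        return wrong if calls["n"] == 1 else correct
    return op

print(classify_error(lambda: 7, 7))             # fault-free operation
print(classify_error(make_glitchy(7, 5), 7))    # one-shot disturbance
print(classify_error(lambda: 5, 7))             # error recurs on retry
```

Counting how often the "intermittent" outcome recurs for the same
operation would be one way to separate a deteriorating component from
one-shot external sources, as the text suggests.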

Periodic test refers to checking conducted at specific intervals,
such as software testing. The testing then time-shares with operational
computation. Results are only determined after a number of sequential
steps have been accomplished. Preservation of machine status for retry
when periodic testing detects an error is generally not practical.
However, if the periodicity of test is sufficiently brief, error halt
can occur shortly after failure, minimizing the cumulative effect of
error on post-failure computation. The single failure assumption is
still valid if the period between tests is short. Intermittent errors
will not be detected by periodic testing until they become solid. Even
in a continuously tested machine, hard-core checking circuitry is more
reasonably tested periodically.

The unique nature of the added checking hardware providing 
continuous concurrent testing to the different logic circuits of the 
machine results in high cost. A tradeoff in favor of a periodic, 
interruptable test procedure exercised at frequent intervals appeared 
attractive for the example design. 

4. Deterministic vs. Non-Deterministic 

A deterministic test yields a definite answer to the question
of whether or not an error exists. A non-deterministic test yields
results which are interpreted statistically against an expected
distribution to determine the probability of the existence of error. The
terms are more often applied in relation to software testing procedures
since hardware testing is always deterministic. Non-deterministic
testing was not attractive for the example BIT design because of the
requirement for a high degree of confidence in test results.
Statistical techniques were, however, found useful in selecting
initializing data.

5. Combinatorial vs. Sequential

Seshu and Freeman [Ref. 45] classify the organization of
testing into two different categories, combinatorial and sequential.
A combinatorial testing procedure involves application of a fixed set
of inputs to the machine with the output results being analyzed to
identify failures. As an example, non-deterministic testing is
combinatorial. A sequential procedure has no fixed set of tests which
are applied. The result of the first test sequence determines which test
sequence will be used next. Sequential testing is more efficient since
selection leads to fewer tests. These two categories should not be
confused with the often used classification of logic as combinatorial
(combinational) or sequential. Combinatorial and sequential testing
procedures clearly refer to classes of software testing and not to
concurrent hardware test.


C. ALTERNATIVES 


1. General 
The previous section presented several categories which can be
used to describe test procedures. In practice, the specific
procedures presented in the literature tend to fall simultaneously into
several of the categories previously mentioned; all are a blend of
alternate approaches having favorable characteristics relative to their
intended applications. The discussion of specific alternatives
requires a further cataloging effort, difficult because of the diversity
of approaches to test and because of the aforementioned overlapping of
categories. The discussion presented is not intended to be
comprehensive; it is meant to demonstrate the diversity existing in the
test field and to introduce some techniques which proved useful in
developing the specific blend of approaches best meeting the
requirements of the example design.


Since most of the test alternatives identified have been
presented in the literature, the discussions are usually short, rapidly
settling to a single level of interest. Some discuss the systems
approach, giving overall techniques for testing the computer's different
major units. Others have developed schemes for determining the optimal
test sequences for checking one unit of the computer (e.g., the
arithmetic unit, or the memory). Such schemes examine the states of the
elements comprising the unit under consideration, the elements being
identified as either fault-free or failed, and develop tests to yield
the final diagnostic results on the entire unit. Still other techniques
examine the states of the inputs and outputs of a single logic element,
or block of elements (e.g., an AND gate or a multiplier block), with
the goal of locating a failed element. The presentation of
alternatives below will generally move from the system level to the
logic-block level; however, the typing is loosely defined and often
difficult.

2. Coding 

A large variety of schemes and a significant body of theory have 
been developed in the literature relative to coding test techniques. 
Generally, coding represents a succinct way of supplying redundant 
information to provide a norm for comparison. Codes can be used to 
detect and correct single or multiple errors. The program constraints
imposed on the example BIT design eliminate from consideration
error-correcting codes and those requiring core memory storage. For this
reason, only parity was considered potentially applicable for the
example design. Its nature and possible use will be discussed next.

Parity is the simplest error-detecting code, consisting of one
redundant bit of information which makes the sum of the information bits
plus the parity bit either even or odd as desired. For a binary number
N, where $a_i$ is the binary value for the ith bit location, parity P(N)
can be expressed as

$$P(N) = \Big( \sum_i a_i \Big) \bmod 2$$

The correct parity value for a data word is known a priori. Upon
completion of an operation, the correct parity of the result is known
and is generally attached to the result as an additional bit. The
actual parity is then calculated and compared to the expected parity to
determine whether or not error has occurred.
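
A minimal sketch of parity generation and comparison follows. The
function names and the 8-bit data word are hypothetical; the sketch
only illustrates the modulo-2 sum of bits and the comparison of actual
against expected parity.

```python
def parity_bit(word, odd=False):
    """Return the bit that makes the total count of 1s even (or odd).

    `word` is assumed to be a non-negative integer.
    """
    bit = bin(word).count("1") % 2      # modulo-2 sum of the data bits
    return bit ^ 1 if odd else bit

def parity_ok(word, stored_parity, odd=False):
    """Compare the recomputed parity against the attached parity bit."""
    return parity_bit(word, odd) == stored_parity

data = 0b1011_0010                      # hypothetical data word
p = parity_bit(data)                    # attached when the word is formed

single_error = data ^ 0b0000_0100       # one bit flipped
double_error = data ^ 0b0001_0100       # two bits flipped

print(parity_ok(data, p))               # word unchanged
print(parity_ok(single_error, p))       # odd number of errors
print(parity_ok(double_error, p))       # even number of errors
```

The double-error case passes the check even though the word is wrong,
which is the limitation of a single parity bit.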

Parity has the capability of detecting odd numbers of errors,
and therefore provides protection beyond the single error assumed. In
the absence of the single error assumption, the risk of undetected
multiple even errors can be calculated. Given an n-bit word resulting
from operations, the binary value $a_i$ of the ith bit position can have
one of two states relative to failure (failure states): it is either
correct or erroneous. The probability of undetected error is just the
sum of the probabilities of multiple even errors. Assuming an
instantaneous probability of error p in bit location i and independence
between bit locations, the probability of k simultaneous errors is
just $p^k$. Accounting for all combinations of ways k errors can occur
in an n-bit word length (n even), the instantaneous probability of
undetected error can be expressed as

$$P(\text{undetected error}) = \sum_{k=2,4,\ldots}^{n} \binom{n}{k} p^k$$

For the word length and bit error probability considered, this
probability is $2.7 \times 10^{-4}$, or .027%, a very low risk.
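
The sum of the probabilities of even numbers of errors described above
can be evaluated directly. The sketch below computes it in the form the
text describes (each group of k errors weighted by p to the power k) and
also in the full binomial form, which additionally requires the
remaining n - k bits to be correct; the binomial form is an addition
made here for comparison. The values n = 24 and p = 0.001 are
hypothetical and are not claimed to be the values used in the text.

```python
from math import comb

def undetected_simple(n, p):
    """Sum over even k >= 2 of C(n, k) * p**k."""
    return sum(comb(n, k) * p**k for k in range(2, n + 1, 2))

def undetected_binomial(n, p):
    """Same sum, with each term also requiring n - k correct bits."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(2, n + 1, 2))

n, p = 24, 1e-3     # hypothetical word length and bit error probability
print(f"simple form:   {undetected_simple(n, p):.4e}")
print(f"binomial form: {undetected_binomial(n, p):.4e}")
```

For small p the double-error term dominates both sums, so the risk of
an undetected error is governed almost entirely by the number of bit
pairs, n(n-1)/2, times p squared.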

Parity can be useful in both software and hardware test
procedures. It is often used to detect single errors in data
transmissions. For the example design its potential use was as a
hardware test where the correct parity was automatically present, or
generated by the circuitry to be checked. A hardware parity generator
and comparator could then be added to provide error indication. An
example application might be to a feedback shift register which always
generates a number with odd parity, to which a parity generator and
comparator could be added to verify proper operation. The generation and
use of parity for comparison was only acceptable for the example design
where core memory storage of parity bits was not required.

3. Diagnostic Partitioning 

The general technique of diagnostic partitioning divides the
computer into smaller entities, each of which can then be tested
separately. Forbes, Rutherford, and Steiglitz [Ref. 13] present such
a technique in which the computer is partitioned into "diagnostic
subsystems," each having certain capabilities. The subsystem
essentially is able to apply stimuli, sequentially execute a series of
operations, receive and process inputs, and communicate diagnostic
results of test to the outside world. The subsystems can then
alternately diagnose each other. A sequence for system diagnosis at the
subsystem level is developed. Their technique of partitioning a machine
into essentially autonomous sections was found to be applicable in the
example design. The test technique involves a periodic, software test
with fault isolation provided by the order of operations. An interesting
feature is the microprogramming of the test routine to provide closer
manipulation of the logic for the reasons previously described in
Section IV-A.

The concept of diagnostic partitioning can be applied to a
partitionable machine in a "bootstrap" fashion. One subsection is
considered to be hard-core, and it is checked by hardware means,
manually, or by software. An example of software test would be
execution of a small number of operations requiring only the hard-core
subsection to implement. Upon verification of the hard-core subsection,
one then uses it to check the next subsection. The two checked
subsections can then be used to check the next and so forth. This
represents a type of sequential testing (vice combinatorial) at the
subsystem level. Manning [Refs. 31 and 32] describes a modification
of such a technique. The difficulty with diagnostic partitioning is
that the architectural designs of many computers do not facilitate
partitioning.

4. Program Hierarchy Testing

A system technique related to diagnostic partitioning examines
the functional capabilities of the computer. A hierarchy of distinct
software programs is used to functionally partition the machine, in
contrast to the physical sectioning associated with the diagnostic
partitioning of the previous section. A high level program periodically
tests the computer functionally by exercising short routines using the
machine instructions to grossly check the computer for proper
operation. Examples of functional checks might be adding, multiplying or
shifting. Such "executive programs" are not intended to be
comprehensive or isolating; they detect errors in functions by comparing
results obtained to previously stored expected results. Once an error
has been identified, a "diagnostic routine" tailored to the type of
functional error detected is executed to provide the isolation required
for repair. While not comprehensive, such a technique allows frequent
running of the short executive routine, while calling on the longer
diagnostic routine only when error is sensed. Cohen and Whitaker
[Ref. 7] describe such a procedure developed at Sylvania. Bashkow,
Friets, and Karson [Ref. 3] divide the diagnostic process by hierarchy
into a command checkout phase, used to assure that the machine is
"breathing" (no gross malfunctions exist), and "executive", "testing",
and "diagnostic" phases to give more detailed checking at lower levels.
The diagnostic programs used are microprogrammed to provide failure
resolving capability.
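
The two-level arrangement of a short executive check and a longer
diagnostic routine can be sketched as follows. Every name here is
hypothetical, the "machine" is modelled as a table of Python functions,
and the diagnostics are far simpler than any real routine; the sketch is
not the procedure of either cited reference.

```python
def executive(alu, checks):
    """Gross functional check: list the functions whose result differs
    from the previously stored expected result."""
    return [name for name, (args, expected) in checks.items()
            if alu[name](*args) != expected]

def diagnose_add(alu, width=8):
    """List result bit positions that the adder cannot set."""
    return [i for i in range(width) if alu["add"](0, 1 << i) != (1 << i)]

def diagnose_shift(alu, width=8):
    """List shift counts that give a wrong result."""
    return [n for n in range(width) if alu["shift"](1, n) != (1 << n)]

def run_bit(alu, checks, diagnostics):
    """Run the executive; call a diagnostic only for flagged functions."""
    return {name: diagnostics[name](alu) for name in executive(alu, checks)}

CHECKS = {"add": ((2, 3), 5), "shift": ((1, 4), 16)}   # stored expectations
DIAGNOSTICS = {"add": diagnose_add, "shift": diagnose_shift}

good = {"add": lambda a, b: a + b, "shift": lambda a, n: a << n}
# Simulated failure: bit 0 of the adder output is stuck at 0.
faulty = dict(good, add=lambda a, b: (a + b) & ~1)

print(run_bit(good, CHECKS, DIAGNOSTICS))
print(run_bit(faulty, CHECKS, DIAGNOSTICS))
```

The executive costs two operations per pass in this toy case, while the
diagnostic for the adder runs eight more only after an error is sensed.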

5. Software Exercise, Hardware Detection 
An interesting combination of testing techniques uses software
routines to exercise the computer periodically and added hardware
circuitry to detect errors. The hardware provides the level of
detection resolution required. Software routines need only thoroughly
exercise the machine, with no attention to order of execution for
isolation being necessary. Fred Lee [Ref. 27] describes such a procedure
in which the machine's operations are broken down into sequences of
events, recognizable as pulses occurring in a specific order. The
correct sequence is provided for the test routine and is compared
against the actual sequence. Hardware monitoring devices provide the
comparative function, with non-coincidence signalling specific error.
With an 18.2% increase in transistor count for test purposes, Lee claims
100% confidence in the device. This procedure is also described by
Sellers, Hsiao and Bearnson [Ref. 43] under the title of "sequential
logic latch checking." While Lee's procedure was not used, the idea of
software exercising and hardware detection was of use for the example
design.
6. The Black-Box Approach 

The black-box approach refers to the process of setting the
inputs of a network and observing the resultant outputs, useful
information thereby being derived without internal access to the
network. A most extensive body of literature reports on varying
schemes to obtain optimal, minimal sets of inputs to diagnose all
possible errors internal to the network. With the growing use of
multicomponent packages inaccessible internally (IC, MSI, and LSI
technology), this test area has received renewed attention. Eldred
[Ref. 12], in one of the earlier papers treating the black-box
approach, discussed the derivation of minimal tests for a simple
network of discrete components by evaluating the input conditions which
should cause the network output to be "activated" or "inhibited."
Results deviating from this norm indicated failure. Armstrong [Ref. 1]
presented a procedure based on "path sensitizing" in which a given
internal fault is selected and its effect is traced to the output for
given input conditions. The procedure continues until all faults have
been treated and the significant input and output patterns derived.
The "truth table" or fault dictionary technique is similar in that a
table of the expected outputs for given inputs and specified internal
failures is derived. Comparison of combinatorial test results to the
fault dictionary determines if an error has occurred, and where.
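
A fault dictionary for a very small network can be sketched as follows.
The network (an AND gate feeding an OR gate), the node names and the
single stuck-at fault model are hypothetical choices made for
illustration and are not taken from the cited references.

```python
from itertools import product

NODES = ("a", "b", "c", "g", "out")
TESTS = list(product((0, 1), repeat=3))     # fixed (combinatorial) input set

def network(a, b, c, fault=None):
    """out = (a AND b) OR c, with an optional (node, stuck_value) fault."""
    def stuck(node, value):
        return fault[1] if fault and fault[0] == node else value
    a, b, c = stuck("a", a), stuck("b", b), stuck("c", c)
    g = stuck("g", a & b)
    return stuck("out", g | c)

def response(fault=None):
    """Outputs observed over the whole test set, seen from outside only."""
    return tuple(network(*t, fault=fault) for t in TESTS)

def build_dictionary():
    """Map each faulty response pattern to the faults that produce it."""
    table = {}
    for node, value in product(NODES, (0, 1)):
        table.setdefault(response((node, value)), []).append((node, value))
    return table

def diagnose(observed, table):
    if observed == response():
        return []                           # matches the fault-free norm
    return table.get(observed, ["unknown"])

table = build_dictionary()
print(diagnose(response(), table))
print(diagnose(response(("a", 0)), table))
print(diagnose(response(("c", 1)), table))
```

Several faults share one response pattern in this network, so the
dictionary localizes a failure to a group of candidates rather than to a
single node. That is acceptable when, as in the example design, only the
faulty package needs to be identified.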

The derivation procedure for a large block of logic can be 
tedious, even when computer aid is used. The requirements for memory
can easily exceed availability in the analysis of large networks. Such 
difficulties have led to the development of simplifying methods for 
automating the analysis of large networks. There is wide agreement in 
the literature that the derivation of minimal input tests for a large 
block of logic must be automated. 

Sellers, Hsiao and Bearnson [Ref. 42] developed an algebraic
technique based on Boolean difference to facilitate learning the effect
of a change in state of a chosen input on the network output. The
procedure involves logically Exclusive-ORing the Boolean output
function, expressed in terms of the inputs, with the same function
having the chosen input inverted. If the Boolean output function is

$$F(x_1, x_2, \ldots, x_i, \ldots, x_n)$$

where $x_i$ are the inputs, for the system they define the Boolean
difference as

$$F(x_1, x_2, \ldots, x_i, \ldots, x_n) \;\mathrm{V}\; F(x_1, x_2, \ldots, \bar{x}_i, \ldots, x_n), \qquad 1 \le i \le n$$

where the chosen input of interest $x_i$ is inverted in the second
expression and V represents the Exclusive-OR operator. The Boolean
difference yields the input conditions for which the output will change
state, given the chosen input state change.
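
The Boolean difference can also be evaluated by brute force for a small
function, which makes its meaning easy to see. The sketch below is not
the algebraic procedure of the cited reference; it simply Exclusive-ORs
the function with a copy having one input inverted, over every input
combination. The example function `f` is hypothetical, and the index
`i` is zero-based.

```python
from itertools import product

def boolean_difference(f, n, i):
    """Return the input tuples for which inverting input i changes f."""
    sensitive = []
    for x in product((0, 1), repeat=n):
        flipped = x[:i] + (1 - x[i],) + x[i + 1:]
        if f(*x) ^ f(*flipped):         # Exclusive-OR of the two evaluations
            sensitive.append(x)
    return sensitive

def f(x1, x2, x3):
    """Hypothetical network: (x1 AND x2) OR x3."""
    return (x1 & x2) | x3

print(boolean_difference(f, 3, 0))      # conditions under which x1 matters
print(boolean_difference(f, 3, 2))      # conditions under which x3 matters
```

For this function a change in x1 reaches the output only when x2 = 1 and
x3 = 0, so those are the input conditions a test for a fault on x1 must
establish.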

Roth [Ref. 41], with his calculus of D-cubes, expands on the
above method, but with a more graphical technique to solve the
sometimes formidable problem of accomplishing algebraic operations such
as V for complex functions. He first expresses the truth table of each
element of the network in a succinct form and then gives rules for 
intersecting the tables of the individual elements to form the table 
describing the entire network. 

The usefulness of such techniques is reported by Galey, Norby
and Roth [Ref. 14] in an earlier version of Roth's later technique.
Four eight-bit input tests were automatically derived, the results
of which would indicate whether any one of 102 possible internal
failures had occurred (but not which one). This illustrates the
concept of testing an internally inaccessible network for failure
without interest in which specific component has failed.

An interesting contrast is offered by Maling and Allen [Ref. 30],
who test a network for failure with the purpose of identifying the
specific failed component. For each n-input component of the logic
net, $2^n$ represents the number of different input combinations. Only
n + 1 of these are necessary to show that each input in turn can control
the output and that the output can take either state. For a net of k
such components where the ith component has $n_i$ inputs, they state that
the number of configurations C of the n + 1 required inputs per
component is

$$C = \sum_{i=1}^{k} (n_i + 1)$$

This number also represents the maximum number of tests required to
thoroughly check the circuit with component isolation. The lower
bound is determined if each test is efficient enough to eliminate half
the components from further consideration. The minimum number of tests
T is then

$$T = \lceil \log_2 C \rceil$$

where $\lceil\ \rceil$ indicates next higher integer. From experience,
they state that the number of tests required is usually approximately
equal to the number of components.
7. Non-Duplicative Hardware Checking 

Checking by adding hardware which does not duplicate the circuitry 
being checked provides the benefits of hardware test without 
the cost of duplication. Rao [Ref. 39] describes a method for checking 
arithmetic-type operations in a processor through the use of residue 
coding generated and employed by added hardware, without storage, to 
identify errors but not to locate them. The residue code was used to 
provide a high level of multiple-error checking capability not required 
in the example design. The 1000 gate processor required 490 added gates 
to check it, or a 40% increase in cost which would be unacceptably high 
for the example design. Sellers, Hsiao and Bearnson have compiled a 
comprehensive volume [Ref. 43] on error detecting logic, which is the 
only one of its kind identified by the author. The cited reference 
is an excellent source of non-duplicative hardware checking schemes. 
The use of non-duplicative hardware schemes appeared attractive for 
the example design, particularly for the hard-core circuitry included 
in the test problem. 
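
The residue-checking idea can be illustrated with a short sketch. It is not Rao's design: the check modulus of 3, the 24-bit word length, and the names `residue` and `checked_add` are assumptions made only for illustration. A small checker predicts the residue of the sum from the residues of the operands and compares it with the residue of the main adder's output; a mismatch identifies an error without locating it.

```python
MODULUS = 3                      # assumed check modulus (illustrative choice)
WORD_BITS = 24                   # assumed word length
MASK = (1 << WORD_BITS) - 1

def residue(x):
    """Residue of x with respect to the check modulus."""
    return x % MODULUS

def checked_add(a, b, fault_mask=0):
    """Add two words and return (sum, error_flag).

    fault_mask flips bits of the main adder output to simulate a failure.
    """
    full = a + b
    carry_out = full >> WORD_BITS
    total = (full & MASK) ^ fault_mask           # main adder result
    # Checker path: works only on the small residues of the operands.
    predicted = (residue(a) + residue(b)) % MODULUS
    # Residue of the observed result, with the carry-out weighted by 2^24.
    actual = (residue(total) + carry_out * residue(1 << WORD_BITS)) % MODULUS
    return total, predicted != actual
```

Because no power of two is divisible by 3, any single flipped bit in the sum changes its residue and is flagged in this sketch.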
8. Replication and Comparison 

When other schemes do not provide adequate checking, one can 
replicate circuitry, operate the replicated portions in parallel and 
compare the results, with any non-coincidence indicating error. While 
the technique is expensive (and unacceptable for the example design) 
when employed on a large scale, it often presents the only technique 
by which isolated small blocks of circuitry, or highly irregular 
circuitry, can be thoroughly checked. For the example design, duplication 
of small sections was very useful. The replicate and compare concept 
is often applied when high reliability requirements force the use of 
redundant hardware on a large scale. Switching to the unfailed 
duplicate offers continued operation while the failed portion is repaired. 
Automatic repair is not appropriate to this investigation, yet it 
proceeds naturally from some of the methods found useful and 
therefore represents a good topic for further related investigation 
by others. Duplication and comparison, recognized as one of the most 
effective test techniques, formed the basis for a unique application 
in the example design of the diagnostic partitioning scheme described 
earlier. 
9. Probabilistic Method 

A non-deterministic method which is periodic and combinatorial 
is presented by Merwin [Ref. 33]. A block of combinatorial logic 
(as opposed to sequential logic having feedback paths, and not to be 
confused with combinatorial test) having many inputs is tested by first 
establishing the expected distribution of output values. Each of the 
possible combinations of input values is considered equally likely. The 
output pattern resulting from each input pattern is derived. The statistical 
appearance of a given logical value at each specific output of the output 
set can then be determined. For example, if there are 16 possible input 
combinations (four inputs) and three outputs, output number two may 
have the value logical one for eight of the input combinations. The 
logical value one would then be expected 8/16 or 1/2 of the time at 
output number two. Merwin attaches a random number generator to the 
inputs and tabulates the incidence of appearance of the logical value 
one at each of the outputs. Deviation of the actual ratios from the 
expected ratios may signify an error. If output two took the value 
logical one only 1/16 of the time instead of the expected 1/2 of 
the time, error would be likely. Decision criteria can be established 
using statistical procedures. The random number generator as a source 
of random bit patterns was useful in the example design. 
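
The procedure can be sketched in software as follows. The four-input, three-output block, the fixed tolerance used as the decision criterion, and all function names are hypothetical; Merwin's actual circuits and statistical criteria are not reproduced here. The second output of the assumed block is a parity function, so it takes the value one for eight of the 16 input combinations, as in the example above.

```python
import random
from itertools import product

def block(a, b, c, d):
    """Hypothetical combinational block with four inputs and three outputs."""
    return (a & b, a ^ b ^ c ^ d, (a | b) & (c | d))

def expected_ratios(fn, n_inputs):
    """Fraction of the equally likely input patterns giving a one at each output."""
    patterns = list(product((0, 1), repeat=n_inputs))
    outputs = [fn(*p) for p in patterns]
    return [sum(column) / len(patterns) for column in zip(*outputs)]

def observed_ratios(fn, n_inputs, n_outputs, trials, rng):
    """Drive the block with random patterns and tabulate ones at each output."""
    counts = [0] * n_outputs
    for _ in range(trials):
        pattern = [rng.randint(0, 1) for _ in range(n_inputs)]
        for k, bit in enumerate(fn(*pattern)):
            counts[k] += bit
    return [c / trials for c in counts]

def suspicious(expected, observed, tolerance):
    """Flag each output whose observed ratio deviates beyond the tolerance."""
    return [abs(e - o) > tolerance for e, o in zip(expected, observed)]
```

A stuck-at-zero fault on the second output would drive its observed ratio to zero against an expected 1/2, which the comparison flags.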
V. THE EXAMPLE DESIGN 


A. THE TEST CONCEPT 
The parent computer was divided into units: 


1. The processor unit - containing arithmetic logic and 
general purpose registers. 


2. The control unit - to provide control signals for direction 
of operations in the processor unit. 


3. The core memory unit - to provide storage of the flight 
program and temporary data. 


4. The input/output (I/O) unit - to provide interface between 
the computer and the equipment it serves. 


The I/O unit will not be considered in the present investigation. 

The proposed instruction set for the computer (to be termed the 
macro-instruction set) provided for an extensive half-word/half-register 
addressing and manipulation capability. Processing was to 
be possible on 24-bit words (full-word operations), on the right or 
left 12 bits of the 24-bit word separately (separate half-word 
operations), or on the right and left 12 bits of the 24-bit word 
simultaneously (parallel half-word operations). With little added hardware 
and design effort, it appeared possible to configure the highly regular 
logic of the processor unit into two autonomous halves, each possessing 
multi-functional capabilities. This diagnostic partitioning in effect 
provided a duplex redundant processor unit without the expense of 
duplicating the hardware. This technique will be termed "split 
duplication." 

With the proposed high speed of the parent computer, sufficient 
time was available when the machine was not performing its basic 
operational mission (idle time) to time-share a periodically exercised 
test procedure without imposing any functional degradation. This would 
be particularly true if the test procedure, once initiated, could be 
interrupted to return the computer to operational computation without 
destroying test efficacy. Idle time was to be available every few 
seconds, validating the single failure assumption through short 
periodicity of test. The lower cost advantage of periodic, program-oriented 
testing could thereby be enjoyed. 

Two modes of operation were identified. In "normal" mode operation, 
denoting mission operational computations, both halves of the computer 
would be used together, making full-word, separate half-word, or 
parallel half-word operations possible. In "test" mode, denoting 
idle-time test exercising, only parallel half-word operations would be 
possible. During test mode, the autonomous processor halves would be 
loaded with identical half-word bit patterns. Identical parallel 
operations would then be executed on the like data independently. Comparison 
of the results would then be accomplished, with non-coincidence of the 
two halves indicating error. The advantages of the superior duplication 
and compare method could be enjoyed without the cost disadvantage of 
duplicated hardware. 

The source of data words with which to initialize the two processor 
halves during test mode remained to be resolved since core storage was 
not acceptable. The possibility of using an inexpensive hardware 
pseudo-random number generator, similar to the one used in Merwin's 
probabilistic method, appeared to be an attractive option which was 
compatible with the concept of interruptable test while requiring no 
core storage. Random patterns would more nearly simulate inputs used 
during normal mode operation. An argument can be made for "worst-case" 
testing in which a small number of unusual bit patterns not normally 
encountered in normal mode operation are used to stress the machine in 
a worst-case manner. Such stressing appeared to be more appropriate 
for marginal testing on the ground when such worst-case patterns might 
be expected to hasten impending failure. Additionally, with random 
patterns no "end-of-test" point needed to be identified since the machine 
was to revert to test mode at any time not required for normal mode 
operation. Finally, the storage required for worst-case bit patterns 
obviated their further consideration. 

The use of a pseudo-random number generator allowed the core memory 
unit to be disconnected from the processor unit during test mode, and 
made possible the core memory unit's separate checking either 
concurrently with, prior to, or subsequent to processor unit test. The control 
unit, however, was required in test mode to supply the control signals 
to direct the parallel half-word operations. Testing of the control 
unit itself, and the location and execution of the exercising test 
routine still needed resolution. 

The issuance of accurate control signals by the control unit to the 
processor unit is a prerequisite to correct computation. The control 
unit was to be microprogrammed using a read-only memory (ROM) as the 
storage device. The control signals appropriate for executing the 
macro-instruction set were to be hard-wired in the form of short 
routines of the lower order micro-instructions. The hard-wiring 
consisted of arrays of transistors implemented on a small number of 
silicon chips, the whole comprising the ROM. The remainder of the 
control unit consisted of the selection and sequencing circuitry required 
to assure issuance of the proper signals in a timely manner. 

Because of the standard packaged arrays available with which to 
implement the ROM (the low risk nature of the program dictated use of 
off-the-shelf hardware), sufficient unused storage capacity beyond the 
requirements for the microprogrammed control signals was present to allow 
storage of a microprogrammed test routine. Careful, efficient 
microprogramming of the test sequences promised a much shorter test routine 
requiring significantly less ROM storage than the comparable core 
memory storage needed for an equivalent routine programmed using the 
macro-instruction set. Such a scheme also enjoys the inherent advantage 
of the lower order micro-instruction set for thorough exercise of the 
computer at the logic level. An additional significant 
advantage for an interruptable, time-shared test routine is the much 
shorter cycle time of the read-only memory compared to the core memory.^9 
Note should be made here that test mode exercise of the processor unit 
could be accomplished entirely independent of the core memory unit. 

Since the control unit was to issue the control signals directing 
the test routine, it became hard-core hardware whose proper 
functioning had to be continuously assured. Hardware techniques for continuous, 
concurrent testing of the control unit were therefore essential to the 
test concept. As will become evident when the control unit BIT design 
is discussed, the highly irregular nature of control unit circuitry 
tends to necessitate hardware test techniques in any case. 

^9 A typical core memory cycle time is 2 µsec while a typical ROM 
cycle time is 200 nsec, 10 times faster. 

With the test concept developed, the more detailed BIT design of 
each unit can now be examined. 


B. THE PROCESSOR UNIT 

With the exception of the power supply, considered to be hard- 
core servicing hardware excluded from the test problem, no hard-core 
hardware requiring continuous test was to be located in the processor 
unit. The split duplication, periodic technique of testing the pro- 
cessor unit could be expected to thoroughly check its operation. 

The contents of the general processor module resulting from 
partitioning the processor unit are shown in Figure 2. Figure 3a shows 
the 24-bit data path divided into four-bit groups, with the double line 
denoting the left and right half-word division. Two four-bit groups, 
$L_i$ and $R_i$, are physically located on the same module, providing eight 
bits of the 24-bit wide data path. The remaining groups are likewise 
associated on separate modules, a total of three identical modules 
(see Figure 2) being necessary to implement a 24-bit path. Modification 
of word length in eight-bit increments is possible, in consonance 
with the objective of flexibility of word length. For example, 
addition of a fourth identical module would easily convert the 
processor to a 32-bit path width. 

Emphasis should be placed on the fact that the description above 
refers to a data path, and not to a single register or a single 
functional circuit. The amount of hardware implemented on the 2A NAFI 
module is dictated by its area and pin limitations, discussed in Section 
II-B-2. The entire processor unit can then be thought of as consisting 
of a series of three-module sets, the modules within each set being 
identical. The total number of modules in the processor unit would be 
a multiple of three. 

Providing isolation to the modular level has only been briefly 
discussed so far. Identical half-word bit patterns are used to 
initialize the processor circuitry being tested. While the computer is in 
test mode, these bit patterns undergo parallel operations concurrently 
in the autonomous halves. The results of such operations should 
therefore be identical at each point in the data path. Any difference 
indicates that a fault exists. Non-coincidence is signalled by a 
hardware comparator placed in each module to compare the autonomous 
halves' results. The required fault detection and isolation are hence 
achieved by the placement of the comparators in the data path at the 
modular level. Comparison takes place continuously during test mode at 
each clock pulse, so interruption to return to normal mode operation 
has no effect on test efficacy. 

The decentralized power supply located in each module consisted 
of the final step of regulation required to provide the power level or 
levels necessary in the module. The decoding of the control signals 
was also accomplished in the associated module. Decode could thereby 
be checked by the same technique as other processor hardware, 
eliminating the necessity for the more difficult, costly continuous checking 
of decode circuitry located in the control unit. Any failure in 
the power supply serving the module or in the decoding function would 
occur in the module. By tying the hard-core checking circuitry for 
testing the local power supply (not treated herein) into the processor 
module checking circuitry, a single error signal could be issued from 
the module in case of failure. For the example design, the reason for 
failure within the module did not need to be identified; only isolation 
to the modular level was required. If a centralized power supply 
provided fine power regulation and if decode were located outside the 
module served, precautions would be necessary to ensure that failures 
in these functions did not cause failure within the module to be 
erroneously signalled. Confidence in the error signal once issued is 
increased by the decentralizing scheme described. 

In test mode, only parallel half-word operations are accomplished. 
In normal mode, however, full-word and separate half-word operations 
are also utilized. Differences in the execution of operations in the 
two modes had to be identified to ensure that test procedures thoroughly 
exercised the circuitry, and that test hardware did not degrade normal 
mode operation. The carry forward found in adders, shift registers and 
counters in the processor unit was the major such difference. 

Figures 3b and 3c show the carries associated with parallel half-word 
and with full-word operations, respectively. In the case of 
parallel half-word operations, the carries between adjacent four-bit 
groups in the two halves are identical. For example, the carries from 
$L_1$ to $L_2$ and from $R_1$ to $R_2$ are the same. Since the $L_1$ and $R_1$ groups are 
located in the same module, the carries from the most significant ends 
of $L_1$ and $R_1$ are identical when no fault exists. These carries can 
then be compared, with non-coincidence indicating a failure in that 
module. 

One difficulty arises during test mode parallel half-word operations 
when an error in a carry is detected; e.g., the carry from $L_1$ to $L_2$ 
differs from the one from $R_1$ to $R_2$. Error is signalled in the current 
module. The differing carries, however, cause the bit contents of $L_2$ 
and $R_2$ in the next module to differ, and because they don't compare, 
error is also signalled in the next module. This difficulty can be 
resolved by inhibiting the error signal in module i+1 when an error 
signal is issued from module i preceding it. 

Another difficulty arises because during full-word operations in 
normal mode, the bit contents of the groups $L_i$ and $R_i$ in the same module 
may differ with no faults existing. Likewise, the carries propagated 
from $L_i$ to $L_{i+1}$ and from $R_i$ to $R_{i+1}$ may also differ. The error 
signal due to non-coincidence must only be allowed in test mode, in 
which any non-coincidence is the result of failure. A test-enable 
signal can be applied to checking circuitry in test mode. 

It was also desirable to eliminate any gating from the inter-modular 
carry paths to avoid propagation delays. Figure 4a shows the checker 
circuitry added to each module. Figures 4b and 4c show possible logic 
implementations of the desired truth tables for the carry checker and 
error-inhibit respectively. Figure 5 shows the relationships between 
two adjacent modules. 

Note should be made that the error inhibition in the case of the 
first difficulty discussed does not allow two adjacent modules to signal 
error during the same periodic test iteration. Both carries from one 
module to another are also assumed not to fail simultaneously, in which 
case the comparison check would be passed in spite of existing failures. 
Both of these cases are highly improbable and represent part of the 
undetected failure risk accepted under the single-failure assumption for 
built-in-test. In the case of simultaneous failures in adjacent modules, 
only one is signalled. However, upon checkout after repair or 
replacement of the signalling module, the second module would then immediately 
indicate failure. 
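
The combined effect of the test-enable signal and the error inhibit on one module can be written as a single Boolean expression. The sketch below is a behavioral illustration with hypothetical names; it does not reproduce the gate-level implementations of Figures 4b and 4c.

```python
def error_signal(halves_differ, test_enable, prev_module_error):
    """Error output of module i+1.

    halves_differ     : the module's comparator sees non-coincidence
    test_enable       : True only in test mode
    prev_module_error : error signal issued by the preceding module i
    """
    return bool(test_enable and halves_differ and not prev_module_error)

def adjacent_pair(differ_i, differ_next, test_enable):
    """Error signals of module i and the module that follows it."""
    err_i = error_signal(differ_i, test_enable, False)
    err_next = error_signal(differ_next, test_enable, err_i)
    return err_i, err_next
```

In normal mode no error is issued regardless of the comparator outputs, and in test mode a mismatch in both of two adjacent modules is attributed to the first one only, which is the behavior described above.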

The fault-detecting circuitry described thus far does not distinguish 
between faults occurring in the module and faults occurring in the data 
transfer paths between that module and the previous one. Circuitry to 
provide such isolation could be added, and would consist of another 
comparator if the additional cost were acceptable. Unless the additional 
comparator were incorporated, determining whether the fault exists in 
the module or in the data transfer paths between that module and the 
preceding one would be left to ground maintenance personnel. 

The pseudo-random number generator has been very briefly treated. 
Such a device is capable of providing long sequences of data words. 
A 12-bit generator was required for the example design. Golomb [Ref. 15] 
describes the design of a simple linear feedback shift register requiring 
very little hardware. An example generator which adequately fulfills the 
test requirements under consideration is included as Appendix A. The 
maximum length sequence of $2^{12}$ different patterns was obtained by 
implementing a modulo two irreducible polynomial found in Peterson 
[Ref. 36] and adding the nonlinearity of the important all-zero case. 
The patterns so obtained met Golomb's tests of randomness in each bit 
location. A self-checking pseudo-random number generator design 
using more hardware is illustrated by Sellers, Hsiao, and Bearnson 
[Ref. 43] under the title of "unit distance code parity checked counters." 
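
A generator of this general kind can be sketched as follows. This is not the Appendix A design: the feedback taps are an assumption, taken from the primitive polynomial $x^{12} + x^6 + x^4 + x + 1$ (its reciprocal gives the same period), and the polynomial actually chosen from Peterson is not stated in this section. The names `next_pattern` and `sequence` are hypothetical. A plain linear feedback shift register never visits the all-zero state; inverting the feedback bit whenever all bits other than the one being shifted out are zero inserts that state into the cycle.

```python
WIDTH = 12
MASK = (1 << WIDTH) - 1

def next_pattern(state):
    """Advance the 12-bit generator by one clock pulse."""
    # Linear feedback: exclusive-OR of the tapped stages (assumed taps).
    fb = (state ^ (state >> 6) ^ (state >> 8) ^ (state >> 11)) & 1
    # Nonlinear term: invert the feedback when every stage except the one
    # being shifted out holds zero, so the all-zero pattern joins the cycle.
    if (state >> 1) == 0:
        fb ^= 1
    return (state >> 1) | (fb << (WIDTH - 1))

def sequence(seed=0):
    """Yield successive 12-bit patterns starting from the seed."""
    state = seed & MASK
    while True:
        yield state
        state = next_pattern(state)
```

Under these assumptions the register steps through every 12-bit pattern, including all zeros, before repeating.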


C. THE CONTROL UNIT 

The control unit was the least regular of the units to be 
self-tested. Additionally, it was hard-core, requiring continuous test to 
validate the control signals issued to the processor from the ROM. The 
split duplication test concept of periodically exercising the processor 
unit during idle time presupposed a fault-free control unit able to 
issue appropriate control signals to direct test exercises whenever 
such idle time became available. Continuous testing of the control 
unit with added checking hardware would assure its fault-free 
availability by signalling its unavailability upon occurrence of a failure. 
Partitioning the control unit to provide modular isolation of failure 
while minimizing the requirement for added hardware is the subject of 
this section. Since the control unit was the only unit requiring 
continuous test, it should be recognized that a large portion of the 
overall hardware penalty for providing BIT to the computer as a whole 
was to be paid in the control unit. 

Testing the control unit consisted of the following steps: 


1. Testing the ROM for correct word content 

2. Testing proper accessing of the ROM 

3. Testing proper sequencing of accesses 

4. Testing the checking hardware, which was also subject to 
failure. 


Testing the checking hardware was a problem common to all the units, 
and it will be treated in Section E below. Figure 6 shows the 
non-partitioned control unit organization for the parent computer. Figure 7 
illustrates the general modular partitioning and hardware added for 
checking, which is described below. 

Testing the ROM for proper word content will be examined first. 
The control signals used to properly execute the flight program (and 
the test routine) are stored in the ROM in the form of hard-wired bit 
patterns called microwords.^10 The contents of the microword can change 
under failure, having a catastrophic effect on the control unit's ability 
to issue proper signals and consequently on the computer's ability to 
execute the flight program. The ROM, exclusive of addressing hardware, 
will be assumed to be implemented in segments of 256 eight-bit words, 
shown in Figure 8, although this implementation is not critical to 
the test procedures described. The ROM microword length will be 
assumed to be 48 bits, also not critical. The ROM is then implemented 
in six segments, as illustrated in Figure 8. 

Three fields of the microword format (see Figure 8) have test 
significance: 


1. Parity field (P) - one bit dedicated to parity of the entire 
microword in which it is located. 


2. Next address field (NA) - eight bits containing the next 
address in the microprogram sequence (the next microword to 
be executed). This field was necessary even without BIT. 


3. Current address field (CA) - eight bits containing the address 
of the microword in which it is located. 


^10 Depending on the method of microprogramming, a microword may include 
several micro-instructions to control simultaneous operations in 
the processor and elsewhere. A microprogram is executed one microword 
at a time. 


The six segments comprising the ROM are checked for correct word 
content by parity. The addressing circuitry accesses only one 
microword at a time. Parity is generated on the microword issued to the 
48-bit hold register. This generated parity is then compared to the 
proper parity stored in the parity field of the microword. Note should 
be made that the hold register and the ROM sense amplifiers are also 
checked by this procedure. The functions of parity generation and 
comparison are combined in the parity checker shown in Figure 7. 
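
The parity check on a microword can be sketched as below. Odd parity over the whole 48-bit word and placement of the P bit in the most significant position are assumptions made for illustration, as are the function names; the actual field layout is given in Figure 8.

```python
MICROWORD_BITS = 48
P_BIT = MICROWORD_BITS - 1       # assumed position of the parity field

def ones(x):
    """Number of one bits in x."""
    return bin(x).count("1")

def make_microword(fields47):
    """Attach a parity bit so the 48-bit microword has an odd number of ones."""
    p = 0 if ones(fields47) % 2 == 1 else 1
    return (p << P_BIT) | fields47

def parity_error(hold_register):
    """True when the word in the hold register violates odd parity."""
    return ones(hold_register) % 2 == 0
```

Any single changed bit, whether in the stored word, the sense amplifiers, or the hold register, makes the count of ones even and raises the error.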

The partitioning indicated shows all addressing and decoding 
circuitry in a module separate from the ROM storage segments, sense 
amplifiers and hold register. Divorcing the circuitry functionally 
related to addressing in this manner allows fault isolation to the 
modular level. This technique eliminates the ambiguity as to the 
modular location of failure when a portion of the addressing function is 
implemented in the same module as the ROM storage segments (a good 
example is the address decode, often provided on the same MSI chip 
as the storage devices). 

The single-failure assumption made for the example design contends 
that the probability of multiple simultaneous failures in systems 
composed of components having inherent high component reliability is 
so small that practical test design need not consider it. This 
assumption was justified for discrete components and even for IC's, but 
with the advent of MSI and LSI with their numerous closely-packed 
components, it must be reconsidered. In the context of the present 
subject, one must consider the higher probability of multiple failure 
caused, for example, by a cracked silicon chip where several adjacent 
components would be simultaneously affected. Odd parity, for instance, 
will not detect an even number of failures in one microword. The use 
of parity for ROM content checking appears to be justified by the fact 
that multiple failures would tend to affect more than one microword (to 
continue the example, a chip crack probably would not lie straight along 
the line of devices implementing a single microword). While one ROM access 
might not catch an even number of failures in one microword, very few 
subsequent accesses to different microwords would be necessary before 
a single or multiple odd failure would be detected and signalled. So, 
while the single failure assumption can be questioned for an MSI ROM 
implementation, the use of parity can still be justified. 

Testing the addressing functions of the ROM is accomplished by 
comparing the current address field (CA) of the microword with the step 
counter contents. The step counter (or a second register if timing 
requires the step counter to change prior to the issuance of the 
microword being accessed) contains the address of the microword to which 
access is being attempted. The eight bits of the CA field contain the 
address actually accessed. Comparison of the two indicates whether an 
addressing failure has occurred. The step counter, decode and drivers 
are implemented on the same module. Non-comparison of the CA field 
and the step counter therefore signals an error in this address-function 
module. If the parity check in the ROM storage module fails, 
indicating incorrect microword content or a failed hold register, the 
error signal from the address function module is inhibited since the 
contents of the CA field being used for comparison are now suspect. 
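
The address check and its inhibition by a parity failure reduce to two Boolean outputs, one per module. The function and argument names below are hypothetical and the sketch is behavioral only.

```python
def control_unit_errors(step_counter, ca_field, parity_ok):
    """Return (rom_storage_error, address_function_error).

    step_counter : address to which access was attempted
    ca_field     : CA field of the microword actually read
    parity_ok    : result of the parity check on that microword
    """
    rom_storage_error = not parity_ok
    # The address comparison is trusted only when the microword passed
    # parity; otherwise the CA field itself is suspect and the address
    # error is inhibited.
    address_function_error = parity_ok and (ca_field != step_counter)
    return rom_storage_error, address_function_error
```

At most one of the two modules is therefore indicated for any single access.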


Proper sequencing of accesses to the ROM is the most difficult 
check to accomplish. A description of the sequencing process in 
general terms gives insight to the problem. The microprogram contained 
in the ROM consists, in effect, of a series of "subroutines" in a 
lower level language (the micro-instruction set), one "subroutine" 
for each of the macro-instructions used to write the flight program 
stored in the core memory. The flight program instruction word's 
operation code field, representing the macro-instruction, is 
analogously used as the "call" statement for its "subroutine". Since 
the same micro-instructions may be used in different mix to implement 
different macro-instructions, the number of micro-instructions is, in 
general, smaller than the number of macro-instructions. 

Given a new flight program instruction word to be executed, the 
first access to the ROM is dictated by the operation code field of 
the instruction. This operation code is decoded as a selection of one 
microword in the ROM. Subsequent accesses to the ROM until the 
"subroutine" started by the operation code "call statement" is completed 
are dictated by the NA field of the microword itself. At the end 
of the sequence, the microword indicates that the sequence is complete 
and a new flight program instruction word is fetched by the FETCH 
CONTROL. Under certain conditions (such as repeats and branches), 
the repeat counter and condition code register dictate that the NA 
field be ignored and that the step counter (ROM address register) be 
incremented or decremented to indicate the next ROM address to be 
accessed. There are, then, several different sources of the next 
ROM address to be accessed: 


1. The operation code field of the program instruction word 
found in the instruction register (the three U fields in Figure 6) 
dictates the initial access to the ROM when executing a given 
program instruction word. 


2. The NA field of the microword just accessed indicates the next 
ROM address to be accessed, except that: 


3. The repeat counter and condition code register can dictate 
direct modification of the step counter to yield the next ROM 
address to be accessed, in which case the NA field of the last 
microword accessed is ignored. 

The SEQUENCE CONTROL selects the proper source of the next ROM 
address to be accessed. It modifies the step counter as required by 
the repeat counter or condition code register, and selects the proper 
U field from the instruction register dependent on 
whether half or full-word instructions are being executed. When the 
NA field is selected as the source of the next address, its contents 
could be held in a separate register until they could be compared with 
the CA field of the microword actually accessed to see if a proper 
accessing had occurred. However, because of the possible other sources 
of the next address, it appeared that the proper functioning of the 
SEQUENCE CONTROL, FETCH CONTROL, REPEAT COUNTER, and CONDITION CODE 
REGISTER could only be assured by duplication, parallel operation, and 
comparison for identical results. Only in this way did adequate 
continuous checking of the proper sequencing of accesses seem feasible. 

While the duplication and comparison test method should be reserved 
for last consideration, as indicated in Section IV-C-8, its application 
to the small logic sections described here appeared to be required to 
provide continuous checking. Controls which are duplicated and compared 
can be placed in any module as long as the duplex circuitry and 
comparator are in the same module. Partitioning of this duplicated 
circuitry was therefore dependent on 2A NAFI module limitations only. 
The portion of Figure 7 labeled SEQUENCE CONTROL MODULE, then, could be 
broken into several modules with isolation of faults to the modular 
level still provided. 
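
The duplication and comparison method can be sketched in software as two copies of the same function driven by the same input, with a comparator flagging any disagreement. The sketch is a model only: the 8-bit ROM address width, the function names, and the injected fault are assumptions chosen for illustration.

```python
from typing import Callable, Tuple

ADDRESS_MASK = 0xFF  # assumed 8-bit step counter, for illustration only


def step_increment(address: int) -> int:
    """Fault-free model of the step counter increment."""
    return (address + 1) & ADDRESS_MASK


def faulty_step_increment(address: int) -> int:
    """Duplicate copy with its least significant output bit stuck at one."""
    return step_increment(address) | 0x01


def compare_duplex(primary: Callable[[int], int],
                   duplicate: Callable[[int], int],
                   value: int) -> Tuple[int, bool]:
    """Run both copies in parallel; return (primary result, error flag)."""
    result_a = primary(value)
    result_b = duplicate(value)
    return result_a, result_a != result_b


result, error = compare_duplex(step_increment, faulty_step_increment, 0x11)
print(hex(result), error)
```

Because the comparator sees only a disagreement, it cannot say which copy failed; keeping both copies and the comparator in one module, as the text requires, is what allows the error signal to isolate the fault to that module.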


D. THE CORE MEMORY UNIT 

Modification of an existing design to meet the requirements for 
a 24-bit word length, 8K core memory for the parent computer was 
considered. The use of an already developed memory design appeared 
favorable in light of the short schedule and low risk nature of the 
program. Although the final choice of memory type and size was 
dependent on changing requirements and therefore not firm, the example 
design will consider modifications of the basic design shown in Figure 9 
to provide a BIT capability with fault detection and isolation to the 
modular level as the goal. The memory to be modified, termed the 
"standard memory unit" (SMU), was a 3D, coincident current, 32-bit 
word length, random access, 4K core memory. The example used serves well 
to demonstrate the factors involved in memory test. 

Reference 35 briefly summarizes the standard techniques for func- 
tionally exercising a core memory. The functional exercisers listed 
below check for proper operation of the memory as a black-box without 
examining specific internal circuits. The standard functional exer- 
cisers are: 

1. Check-sum - checks proper memory loading. This check can be 
accomplished using the flight program and constants stored in 
the computer for the mission. 

2. One's discrimination - checks the memory's ability to read and 
write ones correctly. Memory buffer registers, sense amplifiers, 
the core array, and driving circuits are checked by this test. 

3. Zero's discrimination - checks the memory's ability to read 
and write zeros correctly. The driving circuits are checked 
by this test, as well as the sense amplifiers' sensitivity to 
noise. 

4. Addressing - checks whether or not each memory location can be 
correctly accessed. In addition to those circuits tested by 
the discrimination tests, the memory selection logic, decoders, 
and drivers are checked. 

5. Checkerboard and Inverted Checkerboard - these tests produce 
worst case noise conditions upon half-read, which results in 
maximum inhibit noise whenever a zero is written. The inhibit 
noise from a cycle where zero was written can cause an error 
during the read portion of the next cycle. 

The discrimination and checkerboard tests are aimed at discovering 
marginal conditions, and were not considered appropriate for airborne 
testing. They would certainly be appropriate as part of pre- or post- 
flight checkout on the ground, as discussed earlier in relation to 
marginal testing in general. The check-sum and addressing tests, more 
suited to discovering existing solid failures, appeared to be 
appropriate for in-flight application. 

The five tests enumerated above are program-oriented, periodically 
exercised tests. Test techniques which require added hardware include 
coding and separate checking circuitry for each circuit type. Coding, 
principally parity, is popular for checking memories, but this 
technique fell outside the program constraints for the example design. 
Techniques for adding specialized circuitry to test the memory are 
described in Ch. 14 of Ref. 43. The additional expense of the 
circuitry and complete memory reconfiguration appeared inappropriate for 
the design modification intended. 

Modification of the modular partitioning of the SMU appeared 
necessary to facilitate the test design if isolation to the modular 
level were to be accomplished. While packaged employing standard NAFI 
modules, the SMU did not use the 2A size, but rather the 1A and 1B 
sizes.¹ The standard memory unit was implemented with the equivalent 
of 152 1A NAFI modules. As evident in Figure 9, partitioning was done 
by circuitry type; e.g., there are 16 1B size sense/inhibit modules, 
one 1A address register module, and so forth. Several modules, of 
different types, are involved in one memory access; an address 
register module, address decoder module, timing control and timing 
modules, and sense/inhibit modules are all involved in one access. 
It is difficult to determine, while airborne, in which module the fault 
lies once one is detected by a functional test alone. A unique way of 
applying functional tests and some added hardware were required to 
accomplish the modular isolation capability required. 

¹The number in the NAFI size designator refers to the horizontal 
dimension of area (width), while the letter refers to the thickness. 
1A is the smallest basic size, having unit standard width and unit 
standard thickness. The 2A module is twice as wide as the 1A and 
hence has twice the area, and the 1B is twice as thick as the 1A 
[Ref. 10]. 

Sixteen 1B NAFI modules were used in the SMU to implement the 
sense/inhibit functions for the 32 memory planes (32-bit word length). 
This represents circuitry for two planes (bit locations) per 1B 
module. An estimate (based on area limitation because of an 
essentially discrete component implementation of sense/inhibit circuitry, 
and allowing for added checking hardware) indicated that eight 2A 
modules should amply suffice to implement the sense/inhibit functions 
for a 24-bit word length. Three planes were to be served per 2A module. 
It was envisioned that bit locations served by a module would be 
adjacent. Figure 10 illustrates the scheme. 

The implementation of the decode function for the SMU required two 
modules dedicated to X select and two to Y select, each module serving 
the entire core stack. An approach to partitioning which initially 
appeared attractive was to partition the decode logic so that the X 
and Y decode serving a smaller block of the core stack would be placed 
in the same module. However, partitioning the decode in effect doubles 
the logic required for every partitioning (e.g., placing the X and Y 
decode for one quarter of the core stack in one module would, for the 
entire core stack, entail quadruplicating the logic). Duplication and 
comparison required only twice as much decode logic, and this method 
was chosen. For example, the circuitry on one of the two X decode 1B 
modules is duplicated, the duplex hardware being placed in the same 
2A module. Figure 11 shows a decode module. Four 2A modules were 
required for decode in the example design. 

The address register also required duplication for separate test 
by the duplication and comparison technique. Checking of power supplies, 
transient protection, temperature tracking voltage sensors, timing, and 
associated regulators has been excluded from consideration, as they 
are hard-core housekeeping and service functions. The major areas 
subject to failure during flight are the decoding, sense/inhibit and 
select lines, cores, drivers, and amplifiers associated with accessing 
the memory, which are checked by the procedures described herein. 

Fault isolation to the functional module level plus the core stack 
is provided by the test procedures described below. Faults occurring 
in the sense/inhibit functions are isolated to a single sense/inhibit 
module and core stack combination. Faults occurring in the addressing 
function are isolated to a single address register module or decode 
module. Faults occurring in the core stack are isolated to the core 
stack only if all tests can be conducted. No airborne discrimination 
between a single sense/inhibit module and the core stack appeared 
feasible if the sense/inhibit test failed, because later tests could not 
then be confidently conducted. Such discrimination is easily 
accomplished on the ground. While a higher degree of isolation would 
be preferable, the level provided airborne closely focuses the efforts 
of maintenance personnel and greatly reduces the time and cost of 
maintenance. Sub core-stack isolation would probably not be useful since 
the core stack must be treated as an entity by maintenance personnel. 

Testing of the sense/inhibit functions should precede testing of 
the decode function to ensure that the latter tests are valid when 
conducted. The sense/inhibit functions serve the entire core stack; 
that is, a single sense amplifier and a single inhibit driver serve the 
same bit location in all the 8K words of the core stack. Each access 
to the core memory exercises all the sense/inhibit circuitry, since 
all the bit locations of the word are involved. Solid failures result 
in a stuck-at-one or stuck-at-zero condition in a bit location. To 
isolate such fault manifestations to the sense/inhibit module or the 
core stack serving the bit location, one must first detect the fault 
and then relate it to the proper module. The test consists of 
attempting to access a core memory location which contains a previously 
stored constant. A location containing all ones tests for the 
stuck-at-zero condition. Another location containing all zeros tests for 
the stuck-at-one condition. Two core memory locations are therefore 
dedicated for test use, one containing all ones and the other all 
zeros. A second set of such tests using the same cells should be 
performed to verify the restore operation; however, discrimination 
between failures in the sense/inhibit module and the core stack would 
still not be provided because of the possibility of a broken sense line 
(which also looks like a sense amplifier stuck-at-zero). Relating the 
failure to a specific sense/inhibit module is accomplished by checking 
hardware added to each module. Assuming eight sense/inhibit modules 
with three bit locations served per module (24-bit word), one adds a 
three-bit register to each module (that is, in effect, a partitioned 
output buffer register for the core memory). A three-bit comparator 
(XOR) senses the failed condition when the three bit locations are not 
identical. For example, a stuck-at-one failure in the fourth bit 
location would be detected by accessing the memory location containing 
all zeros. The three-bit register of the second sense/inhibit module 
(serving the second three-bit group of the 24-bit word) would read 
100, producing an error signal from the XOR circuit on the module. 
Figure 12 shows the configuration of the sense/inhibit module. 
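
The per-module comparison just described can be modeled in a few lines. This is a sketch under stated assumptions: bit location 1 is taken to be the most significant bit of the word, and the function and constant names are hypothetical.

```python
from typing import List

WORD_LENGTH = 24
BITS_PER_MODULE = 3


def failed_sense_inhibit_modules(word_read: int) -> List[int]:
    """Return the 1-based numbers of modules whose three bits disagree.

    word_read is the word obtained from a test cell that was stored as
    all ones or all zeros, so every three-bit group should be uniform.
    """
    failed = []
    for module in range(WORD_LENGTH // BITS_PER_MODULE):
        # Assumption: module 1 serves the three most significant bits.
        shift = WORD_LENGTH - BITS_PER_MODULE * (module + 1)
        group = (word_read >> shift) & 0b111
        if group not in (0b000, 0b111):      # comparator: bits not identical
            failed.append(module + 1)
    return failed


# Stuck-at-one in the fourth bit location while reading the all-zeros cell:
word = 1 << (WORD_LENGTH - 4)
print(failed_sense_inhibit_modules(word))    # second module reads 100
```

The comparator only detects disagreement within a group; a fault that forces all three bits of one module to the same wrong value would have to be caught by comparing the word with the stored constant.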

The exercising procedure for the decode function and the core 
stack consists of check-summing over sections of memory. The core 
memory contains the stored program and constants (unalterable part of 
memory), which cannot change during flight, and a small section 
(scratch pad) reserved for storage of data which can change in flight. 
Scratch pad test will be discussed separately. Check-summing is 
accomplished by cumulatively adding the contents of all the cells 
of the unalterable part of memory modulo 2^{24}, the final sum accruing 
in the accumulator.¹ The expected check-sum (ECS) for the unalterable 
part of memory has been previously calculated externally and stored in 
the memory as a constant. Coincidence of the calculated sum and the 
ECS (subtraction is often used to give an expected zero result) 
indicates not only that the program stored in that part of memory is 
intact, but also that the accessing process has been properly accomplished. 
Sequential access to each cell of the segment is attempted during 
calculation of the sum; the sum will check with the ECS only if every 
access has been properly executed. The accessing process thoroughly 
exercises the core stack and its associated decode modules. Isolation 
of faults to the decode module (by its internal comparator) or to the 
core stack (by an incorrect check-sum) is thereby provided without 
separate addressing tests, modification of cell contents, or storage 
of any test results. The ECS can be stored at the end of the 
unalterable part of memory. The core memory can also contain the memory 
test program for check-summing, at the price of a few cells of core 
storage. The memory test program can also be microprogrammed in the ROM 
with other test sequences, and this alternative is preferable if 
sufficient ROM space is available. It has been implicit throughout 
the foregoing test procedure description that the control and processor 
units have been tested prior to memory checking so that they can be 
validly used to calculate the check-sums and do comparisons. 

¹Schemes of handling the carry out of the most significant 
place (e.g., addition to the least significant bit location) 
reduce the probability of obtaining a proper check sum when failure 
exists to a negligibly low value. 
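
A minimal sketch of the check-sum calculation follows, assuming a 24-bit accumulator and, as one possible carry-handling scheme, addition of the carry out of the most significant place into the least significant bit location. The function names and the sample cell contents are hypothetical.

```python
from typing import Iterable

WORD_MASK = (1 << 24) - 1  # 24-bit accumulator


def check_sum(cells: Iterable[int], end_around_carry: bool = True) -> int:
    """Cumulatively add the cell contents in a 24-bit accumulator."""
    accumulator = 0
    for word in cells:
        accumulator += word & WORD_MASK
        if accumulator > WORD_MASK:          # carry out of the top place
            accumulator &= WORD_MASK
            if end_around_carry:
                accumulator += 1             # carry re-enters at the LSB
    return accumulator


def memory_intact(cells: Iterable[int], expected_check_sum: int) -> bool:
    """Compare the calculated sum with the stored ECS."""
    return check_sum(cells) == expected_check_sum


program = [0x123456, 0xABCDEF, 0x000010]     # hypothetical stored words
ecs = check_sum(program)                     # calculated externally, pre-flight
print(memory_intact(program, ecs))
print(memory_intact([0x123456, 0xABCDEF, 0x000011], ecs))
```

In the sketch a single altered cell changes the sum, so the comparison with the ECS fails; an access to the wrong cell would change the sum in the same way unless the two cells happened to hold the same contents.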

The scratch pad is tested last, and it must be treated somewhat 
differently, since its contents can change during the mission. 
Consequently, an ECS could not be calculated and stored earlier for 
comparison. In addition, there will be some data stored in scratch pad which 
cannot be destroyed during test mode; e.g., positional data. The same 
check-sum test technique can, however, still be applied if a small 
block of scratch pad cells (block A cells in Figure 13) can be altered 
during test. A like-sized block of stored program cells in the 
unalterable part of memory (block B cells in Figure 13) is identified and its 
ECS externally calculated and stored as a constant prior to flight. 
Figure 13 illustrates the checking procedure for a 1K scratch pad. 256 
words of the scratch pad can be altered (block A cells). The sequence 
of steps to test the 1K scratch pad is listed below: 


1. Write contents of block B cells into block A. 

2. Check-sum block A and compare to previously stored ECS. 

3. Write unalterable scratch pad data of block C cells into block 
A for temporary storage (block A cells and associated decode 
modules have been verified by steps 1 and 2). 

4. Write contents of block B cells into block C. 

5. Check-sum block C and compare to previously stored ECS. 

6. Restore data temporarily stored in block A into block C. 

7. Continue the procedure with blocks D and E to complete scratch 
pad test. 
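
The shuffle described in the steps above can be sketched as follows. The memory is modeled as a Python list and the blocks as index ranges; these representations, the function name, and the small block size in the usage example are assumptions for illustration, and the simple 24-bit sum stands in for whatever check-sum scheme is chosen.

```python
from typing import List, Sequence

WORD_MASK = (1 << 24) - 1


def scratch_pad_test(memory: List[int], block_a: range, block_b: range,
                     data_blocks: Sequence[range], ecs: int) -> bool:
    """Check each scratch pad block against the ECS of program block B."""

    def copy(source: range, destination: range) -> None:
        for src, dst in zip(source, destination):
            memory[dst] = memory[src]

    def verify(block: range) -> bool:
        return (sum(memory[i] for i in block) & WORD_MASK) == ecs

    copy(block_b, block_a)             # step 1
    if not verify(block_a):            # step 2
        return False
    for block in data_blocks:          # blocks C, D, E in turn
        copy(block, block_a)           # step 3: save data in block A
        copy(block_b, block)           # step 4
        passed = verify(block)         # step 5
        copy(block_a, block)           # step 6: restore the saved data
        if not passed:
            return False
    return True                        # step 7: all blocks checked


# Four-word blocks for brevity: B (program), A (alterable), then C, D, E.
memory = [10, 20, 30, 40] + [0] * 4 + list(range(1, 13))
blocks = [range(8, 12), range(12, 16), range(16, 20)]
print(scratch_pad_test(memory, range(4, 8), range(0, 4), blocks, ecs=100))
print(memory[8:])                      # data in C, D, E is preserved
```

Each data block costs three block copies and one check-sum, which is the price of the data shuffles mentioned in the text when block A is made small.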


Note should be made that the size of block A can be quite small, 
if necessary, with a resulting increase in the number of data shuffles 
required to complete the scratch pad test. Alternate techniques to test 
the scratch pad include coding, addition of more hardware, or perhaps 
acceptance of an untested scratch pad in consonance with the reasonable 
test objectives discussed earlier. 


E. TESTING THE CHECKING HARDWARE 

The checking hardware represents hard-core circuitry whose proper 
functioning must be assured before test results are considered valid. 
The failure of checking circuitry can lead to the very undesirable 
indication of error when none exists, or failure to flag existing 
error. To provide assurance that checking hardware is fault-free, one 
can: 

1. Provide redundant circuitry with reliability an order of 
magnitude higher than the circuitry it checks. 

2. Provide some earlier periodic check to verify proper 
operation before test commences. 

3. Verify only during periodic maintenance periods. 

The first alternative tends to be too expensive, at least doubling 
the hardware cost of providing built-in test. The third alternative 
reduces confidence in the test results to an unacceptably low level. 
A periodic gross functional check of the checking circuitry is probably 
most feasible, but at the expense of a few words of core storage. 
Test bit patterns stored in core memory can be used to initialize the 
circuitry so that the left and right half-words will differ. Error 
therefore should be indicated. Identical half-word patterns can then be 
introduced, in which case no error should be signalled. Such tests can be 
made part of the periodic test sequence preceding test of the rest of 
the computer. While it is recognized that comprehensive test has not been 
achieved, one can be assured of a high degree of confidence in the 
checking circuitry for minimal cost and effort. 
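
The gross functional check of a comparator can be sketched as two trials, one that must raise the error signal and one that must not. The 12-bit half-word width, the test pattern, and the function names are assumptions made for this illustration.

```python
from typing import Callable

HALF_WORD_MASK = 0xFFF  # assumed 12-bit half-words of a 24-bit word


def half_word_comparator(left: int, right: int) -> bool:
    """Fault-free checker: signal error when the half-words differ."""
    return left != right


def checker_passes(comparator: Callable[[int, int], bool]) -> bool:
    """Apply differing and then identical half-word patterns."""
    pattern = 0b101010101010
    flags_difference = comparator(pattern, pattern ^ HALF_WORD_MASK)
    silent_on_match = not comparator(pattern, pattern)
    return flags_difference and silent_on_match


print(checker_passes(half_word_comparator))       # healthy checker
print(checker_passes(lambda left, right: False))  # never signals error
print(checker_passes(lambda left, right: True))   # always signals error
```

Both failure modes named in the text, a false error indication and a failure to flag an existing error, are caught by this pair of trials, although a fault affecting only some bit positions could escape a single pattern.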
F. SEQUENCE OF TESTING 

The sequence in which testing should be conducted for the parent 
computer has been indicated in the separate sections. A summary is 
useful to gain better perspective. For those portions periodically 
tested, the priority should be: 

1. Preflight marginal checks. 
2. The checking circuitry (gross functional check). 
3. The processor unit. 
4. The core memory. 
   a. Sense/inhibit function 
   b. The core stack (check-sum) 
   c. Scratch pad 

Those portions tested continuously include: 

1. Hard-core housekeeping and service functions (power supplies, 
clock, and so forth) 
2. The control unit 
3. Core memory (partially) 
   a. Address register 
   b. Decode function 

G. PROCESSING OF ERROR SIGNALS 

Some general comments should be made relative to the handling of 
error signals once issued. If the goal of providing a separate error 
signal from each module of the computer is achieved, a large number 
of sources will be reporting. The reports must be interpreted and 
processed to achieve the desired test goals. 

First, the signal lines should be made "fail-safe"; that is, a 
voltage should be present on each line except when it is reporting 
failure. In this way, the line itself is checked, since the absence 
of a voltage will lead to investigation of the cause. The problem of 
errors propagating from module to module, giving several false error 
signals in addition to the accurate error signal, has been resolved 
locally in the modules by error-inhibit precautions, as in the general 
processor module checking circuitry. An error signal transmission path 
should be provided separate from other computer output paths, and 
by the most direct route, to allow signals to be communicated under a 
failed condition. The problem of signal interpretation remains to be 
resolved. 

A reasonable number of 128 modules with separate error lines will 
be assumed. By the single failure assumption, only one of the 128 lines 
will signal error at one time. With 80 pins limiting the 2A NAFI 
module, two separate error processing modules would be necessary to 
accommodate the required error inputs. Sixty-four error lines would 
then input to each module, well within the 80 pin limitation. Encoding 
circuitry in each module would encode the error source into binary 
code, each error line having a unique binary number identifying it. 
Seven output lines, then, would be necessary from each module, six to 
encode one of 64, and one to indicate which module was sending the 
encoded error message, giving a resolving power of one in 128. A total 
of 71 input and output lines for each module, plus required power 
supply and timing inputs, appears reasonable relative to pin 
limitations. The encoded message would then be routed by direct means to a 
central buffer register where the message of error location would be 
preserved by some recording means for later use by maintenance 
personnel. The message could also be used to turn off the central power 
source to avoid the use of contaminated computations. The pilot would 
be notified of error in accordance with test goals. Care would have 
to be taken to ensure that failures in checking hardware, detected 
during pre-test periodic check, did not initiate computer shutdown. 
In such cases, notification to the pilot that the error checking 
capability of the computer had failed would allow him to continue 
its use with knowledge of the attendant risk. 
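
The encoding arrangement can be sketched as two 64-input encoders feeding a central locator. The sketch assumes the fail-safe convention described earlier (a line carries a voltage, modeled as True, unless it is reporting failure); the function names and the tuple returned are hypothetical.

```python
from typing import List, Optional, Tuple

LINES_PER_MODULE = 64
ERROR_MODULES = 2


def encode_module(lines: List[bool]) -> Optional[int]:
    """Return the six-bit code (0-63) of the failed line, or None."""
    failed = [i for i, voltage_present in enumerate(lines)
              if not voltage_present]
    if not failed:
        return None
    if len(failed) > 1:
        raise ValueError("single failure assumption violated")
    return failed[0]


def locate_error(all_lines: List[bool]) -> Optional[Tuple[int, int]]:
    """Return (error processing module, six-bit code) for the failure."""
    for module in range(ERROR_MODULES):
        start = module * LINES_PER_MODULE
        code = encode_module(all_lines[start:start + LINES_PER_MODULE])
        if code is not None:
            return module, code        # seventh line: which module sent it
    return None


lines = [True] * 128
lines[70] = False                      # line 70 drops its voltage
print(locate_error(lines))
```

Six code bits resolve one line in 64, and the module indication doubles this to one in 128, which matches the 64 inputs plus seven outputs counted per module in the text.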
VI. DESIGN EVALUATION 


Any test design is subject to the unique limitations imposed 
by the parent design program, and the example used was no exception. 
The design presented achieves the reasonable objectives established 
for it in almost all instances: 

1. A thorough self-test capability is provided for the parent 
computer in the airborne environment with a high confidence 
level for the test results. The risk of undetected error is 
kept negligibly low. 

2. The test design represents a unique series of tradeoffs, 
optimizing the test performance per dollar for the short 
schedule, low risk program. Maximum advantage was taken of 
proposed architectural characteristics for the machine. 
The hardware-software split duplication technique and the 
proposed modification of an existing memory design illustrate 
this. 

3. Partitioning of the computer was achieved using the specified 
NAFI 2A module. Detection and isolation of the most important 
classes of faults to this modular level is automatically 
provided. This capability was achieved while allowing for 
flexible word length with minimal basic design changes. In 
the highly regular processor and memory units, the number of 
different module types was kept favorably low. 

4. Redundancy was not generally used. The capability of 
significant test performance is provided for considerably less 
than duplication of hardware. 

5. The test design required very few cells of core storage, 
such requirements being limited to a few constants and possibly 
a memory test routine of short length. A simple 
pseudo-random number generator to provide test bit patterns was 
substituted for a large number of stored constants. The 
coding techniques used required no core storage, leaving 
maximum word length available for operational use. 
Dissociating the core memory from the processor and control units 
simplified the overall test problem. 

6. Operational degradation was minimized throughout. An 
interruptable microprogrammed routine using idle time and executed 
at read only memory cycle speed provides valid test without 
infringing on operational availability. 

Assignment of a specific figure of merit to the test design 
must await choice of specific hardware, and the important 
microprogramming of the test routine upon which much of the potential 
test performance is predicated. 


Various figures of merit can be assigned to a test design. Davis 
[Ref. 9] developed a formula to assign a figure of merit to his 
residue code arithmetic unit test scheme. Other figures relating to 
cost, such as the 10% added hardware figure mentioned earlier or, in 
more absolute terms, the cost of BIT per gate tested, have been assigned. 
The ultimate justification for a self-test capability is its measured 
performance in detecting errors. A high confidence level that a 
high percentage of potential failure sources have been checked seems 
to the author to be the best figure of merit. 

Evaluation of a self-test capability can be accomplished in 
several ways. One technique which allows such evaluation is 
simulation, during which faults can be artificially duplicated to verify 
expected test response. Once the computer is built, actual faults can 
be injected and the response measured. Failure history for a 
production machine can also help in evaluating test efficiency. A 
full-scale simulation of the parent computer with self-test circuitry was 
envisioned. 

The example design promises to provide significantly more test 
capability per dollar than previous designs for similar computers. Its 
potential beneficial effect on overall cost of ownership makes the 
self-test capability provided by the design a very attractive feature. 
Recognition of this fact should certainly result in greater future 
emphasis on the relatively new field of built-in self-test design. 


VII. SUGGESTED FURTHER INVESTIGATION 


The subject of derivation of an optimal test routine using the 
micro-instruction set is an interesting one for future work. Many 
techniques, some briefly presented herein, suggest ways in which the 
states of a block of logic can be identified and related to the micro- 
instructions. Additionally, special instructions for test use only 
can be formulated, as needed. Computer-aided design fits well in this 
category. 

Once error signals from each module can be provided, the subject 
of automatic reconfiguration for continued operation after failure can 
be addressed. Ideally, the error signal from a "bad" module would be 
used to turn off the bad module and switch in a substituting module. 
For example, in the processor, the three identical modules of a set 
could be joined by a fourth identical module to be used in the event 
of failure. The ability to add such a reconfiguration capability in 
modular form might prove to be an attractive option available at extra 
cost dependent on the computer's intended use. 

The ability of a computer to continue to operate after failure in 
a degraded mode using its remaining unfailed circuitry might be inves- 
tigated. For example, limited operations might continue at a slower 
speed for high priority tasks related to aircraft survival (e.g., 
electronic countermeasures and navigation). 

Lastly, the effects of continued technological advance on test 
design and self-repair offer fruitful subjects for further 
investigation. 


APPENDIX A - PSEUDO-RANDOM NUMBER GENERATOR 


The pseudo-random number generator shown below generates the 
maximum length sequence of 2^{12} different 12-bit binary patterns. The 
numbers so produced are random in each bit position. The generator 
implements the modulo 2 irreducible polynomial 

x^{12} + x^{6} + x^{4} + x + 1 

as a linear feedback shift register. A different pattern is produced 
at each clock pulse. The nonlinearity of the all-zero case is added 
by the 11-input NAND gate (which, of course, can be implemented as 
several gates instead of one). 
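
A software model of such a generator is sketched below. It assumes feedback taps at stages 12, 6, 4, and 1 (one common reading of the polynomial's exponents; the mirrored convention yields the reciprocal polynomial, which is also maximum length) and models the 11-input gate as a detector of the all-zero condition in the eleven stages that remain after a shift. The names are hypothetical.

```python
WIDTH = 12
MASK = (1 << WIDTH) - 1
TAPS = (12, 6, 4, 1)          # assumed tap stages; stage 1 is bit 0
LOW_ELEVEN = MASK >> 1        # the eleven stages kept after a shift


def next_pattern(state: int) -> int:
    """Advance the 12-bit shift register by one clock pulse."""
    feedback = 0
    for tap in TAPS:
        feedback ^= (state >> (tap - 1)) & 1
    # All-zero detection: invert the feedback when the eleven retained
    # stages are zero, which splices the all-zero pattern into the cycle.
    if (state & LOW_ELEVEN) == 0:
        feedback ^= 1
    return ((state << 1) | feedback) & MASK


# Count the distinct patterns produced in one full cycle.
state = 0
patterns = set()
for _ in range(1 << WIDTH):
    patterns.add(state)
    state = next_pattern(state)
print(len(patterns), state)
```

Without the all-zero detection the register would cycle through 4095 patterns and would lock up if it ever held zero; the added gate extends the cycle to all 4096 patterns.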